tsJensen

A quest for software excellence...

Taking a Comment Holiday to Escape the Spammers

I upgraded to the latest version of dasBlog a few days ago and inadvertently allowed comments without requiring approval. A spambot comment got through and while I quickly turned on the "require approval" feature, it was too late. Since then I've been bombarded with stupid link spam comments. I even deleted the one post that seemed to be the bot target and created a new post with the same content.

No luck. After many similar spam comments today being posted to the most recent post on my blog, I'm giving up. I'm taking a comment holiday. It won't bother anyone really because I don't get many real comments. I'll enable the comment functionality some day in the future.

In the meantime, if you have a comment, feel free to email me and I'll post it as an addendum to the relevant post.

Live Writer Source Code Format Plugin

I've just installed the plugin blogged about by Mike Ormond, and here's an example of its output, taken from code in the Atrax project I've just published to Codeplex.

namespace Atrax.Library
{
   [DataContract, Serializable]
   public class QueryResult
   {
      /// <summary>
      /// The original query sent by the client.
      /// </summary>
      [DataMember]
      public Query Query { get; set; }

      /// <summary>
      /// Status code sent back to query client's callback url.
      /// </summary>
      [DataMember]
      public string StatusCode { get; set; }
      
      /// <summary>
      /// Status description sent back to query client's callback url.
      /// </summary>
      [DataMember]
      public string StatusDescription { get; set; }
      
      /// <summary>
      /// The XML schema for the result XML.
      /// </summary>
      [DataMember]
      public string ResultSchema { get; set; }

      /// <summary>
      /// The result produced by the query processor, usually XML.
      /// </summary>
      [DataMember]
      public string Result { get; set; }
   }
}

Testing Windows Live Writer with dasBlog

I've run into a little snag with my blog. I installed (or rather, copied) the latest release of dasBlog to my bin (and other directories) and ended up not being able to edit or post new blog entries because of some installation issue (probably a config problem) with FreeTextBox. So while I figure out the problem or wait for someone else to solve it, I decided to try Windows Live Writer.

Here's the FreeTextBox and the text it displays now when attempting to edit or add a post:

I've tried the web.config changes, including the one suggested by Scott Hanselman regarding the dependent assembly, as follows:

<dependentAssembly>
    <assemblyIdentity name="FreeTextBox" publicKeyToken="5962a4e684a48b87" culture="neutral"/>
    <bindingRedirect oldVersion="3.0.5000.0-3.0.5000.6" newVersion="3.1.6.34851"/>
</dependentAssembly>

Well, time to publish this to see what it looks like.

(And second post with some edits.)

Atrax Keyword Extraction Algorithm

Two and a half years ago I wrote an implementation in C# of an algorithm published in 2003 in a short academic paper by Yutaka Matsuo and Mitsuru Ishizuka in the International Journal of Artificial Intelligence Tools. Of course, my code is not a perfect implementation of the algorithm published in the "Keyword Extraction from a Single Document using Word Co-occurrence Statistical Information" paper. I made a number of decisions to make the algorithm as effective as possible while keeping it as fast as I could.

The code was written for Provo Labs, my employer at the time. I've recently obtained written permission from Provo Labs to release this code as open source under the Apache 2.0 license. You can get the code in the Atrax.Html project, a part of the entire Atrax project which I've just released, at http://www.codeplex.com/atrax. Here's the core of the code.

// Copy the frequent terms G into an array; the last term is the MAX g in G.
string[] terms = new string[termsG.Count];
termsG.Values.CopyTo(terms, 0);
foreach (string w in terms)
{
    decimal sumZ = 0;
    for (int i = 0; i < terms.Length - 1; i++) //do calcs for all but MAX
    {
        string g = terms[i];
        if (w != g) //skip where on the diagonal
        {
            int nw = termNw[w];
            decimal Pg = termPg[g];
            decimal D = nw * Pg; //expected co-occurrence of w with g
            if (D != 0.0m) //avoid dividing by zero
            {
                decimal Fwg = termFwg[w][g]; //observed co-occurrence of w with g
                decimal T = Fwg - D;
                decimal Z = (T * T) / D; //chi-square contribution for this (w, g) pair
                sumZ += Z;
            }
        }
    }
    termsX2[w] = sumZ; //X2 score for w
}

SortedDictionary<decimal, string> sortedX2 = new SortedDictionary<decimal, string>();
foreach (KeyValuePair<string, decimal> pair in termsX2)
{
    decimal x2 = pair.Value;
    while (sortedX2.ContainsKey(x2))
    {
        x2 = x2 - 0.00001m;
    }
    sortedX2.Add(x2, pair.Key);
}

//now get simple array of values as lowest to highest X2 terms
string[] x2Terms = new string[sortedX2.Count];
sortedX2.Values.CopyTo(x2Terms, 0);
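
For anyone who wants to play with the scoring step outside the Atrax code base, here is a minimal, self-contained sketch of the same idea. It is not the Atrax source: the class and method names (ChiSquareSketch, Score, Rank) and the toy numbers are made up for illustration. It also differs from the code above in two ways: it sums over every frequent term instead of skipping the MAX term, and it ranks highest score first with a LINQ sort rather than nudging duplicate keys in a SortedDictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names throughout; an illustration only, not the Atrax source.
public static class ChiSquareSketch
{
    // For each term w, sum (observed - expected)^2 / expected over the frequent terms g,
    // where expected = n_w * p_g and observed is the co-occurrence count of w with g.
    public static Dictionary<string, decimal> Score(
        IEnumerable<string> frequentTerms,
        IDictionary<string, int> termNw,
        IDictionary<string, decimal> termPg,
        IDictionary<string, Dictionary<string, decimal>> termFwg)
    {
        var frequent = frequentTerms.ToList();
        var scores = new Dictionary<string, decimal>();
        foreach (string w in termNw.Keys)
        {
            decimal sum = 0m;
            foreach (string g in frequent)
            {
                if (w == g) continue; // skip the diagonal
                decimal expected = termNw[w] * termPg[g];
                if (expected == 0m) continue; // avoid dividing by zero
                decimal observed = 0m;
                if (termFwg.TryGetValue(w, out var row))
                {
                    row.TryGetValue(g, out observed); // stays 0 when the pair never co-occurs
                }
                decimal diff = observed - expected;
                sum += diff * diff / expected;
            }
            scores[w] = sum;
        }
        return scores;
    }

    // Highest score first; ties are broken by ordinal term order so no score is altered.
    public static List<string> Rank(Dictionary<string, decimal> scores)
    {
        return scores
            .OrderByDescending(p => p.Value)
            .ThenBy(p => p.Key, StringComparer.Ordinal)
            .Select(p => p.Key)
            .ToList();
    }

    public static void Main()
    {
        // Toy data (invented): "x" co-occurs unevenly with the frequent terms, "y" evenly.
        var frequent = new[] { "a", "b" };
        var nw = new Dictionary<string, int> { ["x"] = 10, ["y"] = 10 };
        var pg = new Dictionary<string, decimal> { ["a"] = 0.5m, ["b"] = 0.5m };
        var fwg = new Dictionary<string, Dictionary<string, decimal>>
        {
            ["x"] = new Dictionary<string, decimal> { ["a"] = 8m, ["b"] = 2m },
            ["y"] = new Dictionary<string, decimal> { ["a"] = 5m, ["b"] = 5m },
        };

        var scores = Score(frequent, nw, pg, fwg);
        foreach (string term in Rank(scores))
        {
            Console.WriteLine(term + ": " + scores[term]);
        }
    }
}
```

With the toy data, "x" expects 5 co-occurrences with each frequent term but sees 8 and 2, so it scores 9/5 + 9/5 = 3.6, while the evenly distributed "y" scores 0. A term whose co-occurrence is biased toward particular frequent terms is the better keyword candidate.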

I have not spent much time on this algorithm in the past two years and would like to find others with similar interests to help me improve and perfect it. If you have an interest in this kind of research, please join me at the Atrax project page on Codeplex.

Vista Defrag Woefully Inadequate - Enter O&O Defrag

Being rather new to Vista this week, I was sorely disappointed to see the severely dumbed down defrag utility in Vista. A pathetic effort. Really! So after a few highly scientific Google searches, I settled on O&O Defrag and could not be happier.

Here's the lame, incredibly useless UI in Vista's Disk Defragmenter. Note, if you are going to use some other defragmenter on a schedule, which I would recommend, be sure to disable the regularly scheduled Vista defragmenter by unchecking the box. One way of getting there is to go to the Control Panel and then Performance Information and Tools and then Advanced Tools.

And here is only part of the incredibly useful O&O Defrag UI, a shot taken as it defrags my drives:

Of course there are other suitable defrag tools, such as Diskeeper. Perhaps Microsoft wanted the Vista tool to cater only to the basic, uninformed user. If so, they certainly left the market wide open to the more sophisticated tools vendors such as O&O.

From XP Pro to Vista Ultimate x64

I finally took the plunge. Now I get to use 4GB out of 4GB, except that the bare minimum I seem to be able to get Vista x64 down to is a 1.2GB footprint. And that's after hours and hours of experimentation and disabling some visual enhancements, though I feel no real loss there.

While I'm happy to be able to test on x64 virtual images using VMWare's Workstation, I'm afraid I may need to buy four 2GB sticks of RAM. Despite the fact that the additional memory is available now, the larger footprint nearly wipes out the gain.

And that's without running any significant applications, except IE, which is quite a memory hog. I guess the old 640K upper limit days are over.

Yes, RAM is cheap. A quick check on Newegg.com and I found 8GB (4 x 2GB DDR2 800) for $174. I can't even buy three tanks of gas for my SUV for that.

Virtual PC 2007 vs VMWare Workstation 6.5

I'm getting ready to do some serious MOSS 2007 architecture and development work. In the past, I've used Virtual PC 2007 to host a virtual development environment running a Windows server operating system, SQL Server, MOSS and Visual Studio all running in the same virtual machine. And I've never been very happy with the performance of that virtual machine.

So today I decided to give VMWare a try and downloaded VMWare Workstation 6.5. I installed Windows Server 2008 Standard x86 (full install) on a new virtual machine with the same disk space and memory as I had allocated for the same operating system install using Virtual PC 2007. I gave both virtual machines 30GB of disk space and 1GB of RAM. I'm running on a Core 2 Duo 6600 on an ASUS P5B at factory default speed with 4GB of RAM and virtualization support enabled. Both virtual machines' virtual disks live on the same physical drive.

The major advantage of VMWare is its ability to utilize both cores where Virtual PC is stuck with using just one. I'm sure there are additional reasons for the differences in performance. I used PerformanceTest 6.1 from PassMark. I'm sure there are other ways to test virtual machine performance, but this seemed to be a reasonable though unscientific approach. For each run, I made sure my machine was running the same processes and was otherwise idle except for the virtual machine host application.

I only ran the tests that mattered to me: CPU, 2D, memory, and disk. I don't care about 3D and CD performance for the virtual machine. Here are the results:

          test 1   test 2   avg      ratio
vmware
cpu:      326.6    344.4    335.5    2.2x
2D:       28.7     32.2     30.45    3.3x
Memory:   96.7     96.2     96.45    1.2x
Disk:     469.1    454.5    461.8    6.4x
Total:    921.1    927.3    924.2    2.9x
vpc 2007
cpu:      150.7    154.1    152.4
2D:       9.2      9.3      9.25
Memory:   83.3     83.2     83.25
Disk:     69.6     73.8     71.7
Total:    312.8    320.4    316.6

I was amazed to see that overall, the VMWare virtual machine ran 2.9 times faster than the Virtual PC machine. Even more amazing was the performance improvement of the 2D and disk tests, 3.3 and 6.4 times faster respectively.
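
If you want to check my arithmetic or plug in your own numbers, here's a small sketch of how the avg and ratio columns are derived: each average is the mean of the two test runs, and each ratio is the VMWare average divided by the Virtual PC average, rounded to one decimal place. The class and method names (BenchmarkRatios, Average, Ratio) are just made up for this example.

```csharp
using System;

// Hypothetical helper for illustration; the figures are the PerformanceTest scores from the table.
public static class BenchmarkRatios
{
    // Mean of the two test runs.
    public static double Average(double test1, double test2)
    {
        return (test1 + test2) / 2.0;
    }

    // How many times faster VMWare scored than Virtual PC, rounded to one decimal place.
    public static double Ratio(double vmwareAvg, double vpcAvg)
    {
        return Math.Round(vmwareAvg / vpcAvg, 1);
    }

    public static void Main()
    {
        string[] names = { "cpu", "2D", "Memory", "Disk", "Total" };
        double[,] vmware = { { 326.6, 344.4 }, { 28.7, 32.2 }, { 96.7, 96.2 }, { 469.1, 454.5 }, { 921.1, 927.3 } };
        double[,] vpc = { { 150.7, 154.1 }, { 9.2, 9.3 }, { 83.3, 83.2 }, { 69.6, 73.8 }, { 312.8, 320.4 } };

        for (int i = 0; i < names.Length; i++)
        {
            double vmwareAvg = Average(vmware[i, 0], vmware[i, 1]);
            double vpcAvg = Average(vpc[i, 0], vpc[i, 1]);
            Console.WriteLine(names[i] + ": " + Ratio(vmwareAvg, vpcAvg).ToString("0.0") + "x");
        }
    }
}
```

For example, the disk ratio works out to 461.8 / 71.7, or about 6.44, which rounds to the 6.4x quoted above.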

I am now completely sold on the value of the VMWare Workstation license. The best price I found after a quick search was $161. For all the saved frustration in working with a slow virtual machine development image for MOSS, the product is well worth the price. But don't take my word for it; run your own tests if you don't believe me. Of course, if you aren't running a multicore machine, and what self-respecting developer isn't, you probably won't see any improvement. On the other hand, if you have at least two cores, choosing to save a few bucks seems penny wise but pound foolish!

MSDN Subscription - Zupancic Heroic MVP

Props, kudos, and thanks a million to my good friend Aaron Zupancic, one of Microsoft's most valuable MVPs. Last week I pinged him to ask his opinion about a site I'd found advertising a VS 2008 Pro MSDN Pro 2-year sub for $999. It was a decent price but the site seemed a bit sketchy. He seemed to agree with my assessment and then asked if I'd like one of the complimentary VSTS 2008 MSDN Premium subscriptions that Microsoft had sent him with his MVP package. Wow!

That's over $10,000 worth of tools!

Thanks ten thousand times over, Aaron! You're awesome!

In exchange, I promise to faithfully attend the Utah user group meetings!

And before you inundate Aaron with begging, let me dispel the rumor that he has an unlimited supply. Aaron gave the other complimentary sub to another friend and user group supporter. And please don't nag him for picking me over you. Blame me, blind luck, and accidentally perfect timing!