Hiberfil.sys Removal Note to Self

A very large portion of my system drive, a 250GB SSD, seemed to be gobbled up: after a fresh Windows 8.1 install and installing all my tools, I was fast running out of disk space on the C: drive. A quick search for culprits using Effective Search from SowSoft turned up a 64GB file called hiberfil.sys.

After a little hunting and poking, I found the GUI for power management options and tried to turn off hibernate. But that did not get rid of the file.

Not until I found and used the following in an “as Administrator” cmd window did I recover the 64GB of SSD space:

powercfg -h off
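
If you want to confirm the change took, the following command lists the sleep states currently available; hibernate should no longer appear among them:

powercfg /a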

And yes, I have 64GB of RAM. Call me spoiled.

MongoDB C# Driver 2.0 AsQueryable Alternative

I’m a fan of LINQ and the power of the IQueryable<T> interface to compose dynamic queries based on input parameters. So when I needed to compose just such a query in a repository for a MongoDB collection, I found that the MongoDB C# 2.0.1 driver is currently missing the AsQueryable method, which is slated for the 2.1 release of the driver.
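
For reference, here is roughly the LINQ form I was hoping to write, based on the AsQueryable extension slated for 2.1 (a sketch only, not tested against a released driver; customerId, index and limit are hypothetical locals):

using System.Linq;
using MongoDB.Driver.Linq; //AsQueryable is slated to live here in 2.1

var results = _invoiceCollection.AsQueryable()
    .Where(x => x.CustomerId == customerId)
    .OrderByDescending(x => x.TransactionDate)
    .Skip(index)
    .Take(limit)
    .ToList();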

After a little searching, I found the Filter Definition Builder documentation and a happy alternative that lets you build up, or compose, a query based on the presence of query parameters.

And here is a code sample:

public async Task<IEnumerable<MyData>> GetByQuery(MyDataQuery query)
{
    var filter = ComposeFilter(query);
    if (!query.Ascending)
    {
        //newest first, paged with Skip and Limit
        var responses = await _invoiceCollection
            .Find(filter)
            .SortByDescending(x => x.TransactionDate)
            .Skip(query.Index)
            .Limit(query.Limit)
            .ToListAsync();
        return responses;
    }
    //same query with an ascending sort
    var ascResponses = await _invoiceCollection
        .Find(filter)
        .SortBy(x => x.TransactionDate)
        .Skip(query.Index)
        .Limit(query.Limit)
        .ToListAsync();
    return ascResponses;
}

private FilterDefinition<MyData> ComposeFilter(MyDataQuery query)
{
    var builder = Builders<MyData>.Filter;
    //always filter on customer; the overloaded & operator combines filters with a logical AND
    var filter = builder.Eq(x => x.CustomerId, query.CustomerId);
    if (query.StartDate.HasValue)
    {
        filter = filter & builder.Gte(x => x.TransactionDate, query.StartDate.Value);
    }

    if (query.EndDate.HasValue)
    {
        filter = filter & builder.Lt(x => x.TransactionDate, query.EndDate.Value);
    }

    if (query.InvoiceId != null)
    {
        filter = filter & builder.Eq(x => x.InvoiceID, query.InvoiceId);
    }

    if (query.ItemId != null)
    {
        filter = filter & builder.Eq(x => x.ItemID, query.ItemId);
    }
    return filter;
}
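
For context, here is how a caller might exercise the repository method, assuming MyDataQuery is a simple parameter object with the properties used above (the exact shape shown here is hypothetical):

//hypothetical usage; repository wraps _invoiceCollection
var query = new MyDataQuery
{
    CustomerId = "cust-42",
    StartDate = DateTime.UtcNow.AddDays(-30),
    Ascending = false,
    Index = 0,
    Limit = 50
};
var invoices = await repository.GetByQuery(query);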

So now you have a dynamic query based on composition logic. Of course this is a very simple example and there are bound to be far more sophisticated ways of doing the same thing, but I found this one worked very well for me.

Microsoft Graph Engine

This tweet from Scott Hanselman caught my eye because I spent nearly all of 2014 working on a graph solution for my employer that had its genesis in my study of Neo4j, but primarily in my reading of this Microsoft Research paper (Sakr, Elnikety, and He) published in 2012.

[Image: Scott Hanselman’s tweet about Microsoft Graph Engine]

The essence of the Microsoft Research paper is storing edges (node relationships) in memory. So that’s what I did, with unmanaged memory allocated in blocks using the Marshal.AllocHGlobal method. My own efforts were very specific to my employer’s needs at the time and not really usable as a general purpose tool, so I was very pleased to see that Microsoft Research had an ongoing project called Trinity working to produce a more general purpose tool based on many of the same concepts originally explored by Sakr, Elnikety and He.
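
To give a flavor of the approach, here is a minimal sketch of the general idea, not my employer’s actual code; the record layout and capacity handling are illustrative only:

using System;
using System.Runtime.InteropServices;

//Sketch: pack fixed-size edge records (source id, target id) into a
//raw block of unmanaged memory allocated with Marshal.AllocHGlobal.
public class EdgeBlock : IDisposable
{
    private readonly IntPtr _block;
    private readonly int _capacity; //max number of edges in this block
    private int _count;

    public EdgeBlock(int capacity)
    {
        _capacity = capacity;
        _block = Marshal.AllocHGlobal(capacity * sizeof(long) * 2);
    }

    public void AddEdge(long sourceId, long targetId)
    {
        if (_count >= _capacity) throw new InvalidOperationException("block is full");
        var offset = _count * sizeof(long) * 2;
        Marshal.WriteInt64(_block, offset, sourceId);
        Marshal.WriteInt64(_block, offset + sizeof(long), targetId);
        _count++;
    }

    public void Dispose()
    {
        Marshal.FreeHGlobal(_block);
    }
}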

That tool was recently quietly released as the Microsoft Graph Engine. I’ve only had a little time to explore and understand it and look forward to spending more time using it soon. The essence is the same. Store the data in sequential chunks in raw unmanaged memory. Graph Engine uses Visual Studio to generate code on the fly using a meta language called Trinity Specification Language (TSL). Have a look through the documentation. If you’re considering graph database work, put Microsoft’s Graph Engine on your list of items to evaluate.

Blog Vacation is Over

It's been seven months, two job changes, and a crazy busy stretch of family, work and life.

Vacation is over.

I have a long list of things to blog about.

So much to say, so little time to say it.

Git and PowerShell Environment Setup Notes

As a visually oriented developer, I remain command-line challenged. And yet there are some things that seem to be more productive in command-line form, and git seems to be one of them. I’ve not tried every GUI incarnation of a git tool, so I would be happy to be proven wrong about that. Until then, here are my notes on setting up my local machine with Git and PowerShell.

1. Download and install GitHub for Windows from https://windows.github.com/.

2. Create this directory: {home}\Documents\WindowsPowerShell

3. Add this file: profile.ps1 with these lines:

  • $env:path += ";" + (Get-Item "Env:ProgramFiles(x86)").Value + "\Git\bin"
  • $env:PSModulePath = $env:PSModulePath + ";{home}\Documents\WindowsPowerShell\Modules"

Where {home} is your user home directory.
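
For example, with a user profile at C:\Users\me (a hypothetical user name), the finished profile.ps1 would read:

$env:path += ";" + (Get-Item "Env:ProgramFiles(x86)").Value + "\Git\bin"
$env:PSModulePath = $env:PSModulePath + ";C:\Users\me\Documents\WindowsPowerShell\Modules"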

4. Run PowerShell as Admin and execute the following three commands (individually):

5. In PS, navigate to a local git repository directory. And let the magic begin.

6. Review http://think-like-a-git.net/ for a refresher.

Update: I changed the colors for better contrast, at least for my eyes.

[Image: PowerShell prompt showing the updated git colors]

A. Edit the GitPrompt.ps1 file in the {home}\Documents\WindowsPowerShell\Modules\posh-git directory. Delete the “Dark” prefix from the color names (e.g. DarkGreen becomes Green).

B. Edit the .gitconfig file in the {home}\ directory and add the following lines to taste:

[core]
  quotepath = false
  whitespace = fix,-indent-with-non-tab,trailing-space,cr-at-eol
[color]
  diff = auto
  status = auto
  branch = auto
  interactive = auto
  ui = true
  pager = true
[color "branch"]
  current = yellow bold
  local = yellow
  remote = red
[color "diff"]
  meta = yellow bold
  frag = red bold
  old = red reverse
  new = green reverse
  whitespace = white reverse
[color "status"]
  added = yellow
  changed = green
  untracked = cyan reverse
  branch = red

Thanks and credit to many sources for coloring, but in particular to David DeSandro.

Very Large Pseudo Random Number Generator in C#

Thanks to a friend of mine, over the past few weeks the idea of zero knowledge authentication, which has been around for a long time but to which I’d never paid any attention, has been percolating in my head. Finally on Thursday (Thanksgiving holiday here in the U.S.), I dove in and added it to ServiceWire.

Now you can secure your ServiceWire RPC calls without PKI, without exchanging keys, and without exposing a password across the wire. The protocol validates on the server that the client knows the secret key (password), and on the client that the server knows it. It creates a shared encryption key that is used for that one connection session and is never passed across the wire, allowing all subsequent communications between client and server to be encrypted with strong, fast symmetric encryption and denying any man in the middle access to the data or even the service definition.

But this post is not about the entire zero knowledge addition to ServiceWire. Instead I want to share one part of that addition which you may use to create a very large random number. In this case, 512 bytes (that’s 4096 bits for those of you who can only count to 1). A safe prime is a prime of the form N = 2q + 1 where q is itself prime; q is called a Sophie Germain prime. If you want a larger random number, you’ll need a larger safe prime derived from a larger Sophie Germain prime. In ServiceWire, I only include the 512 byte safe prime. There are a number of sources for larger safe primes freely available on the web.

This is not an algorithm entirely of my invention. It is based on several different algorithms I studied while investigating zero knowledge secure remote password protocols. I did take some liberties by adding a randomly selected smaller safe prime as the exponent in the algorithm.

And here’s the code:

using System;
using System.Linq;
using System.Numerics;
using System.Security.Cryptography;

public class ZkProtocol
{
   private readonly SHA256 _sha;
   private readonly Random _random;
   private readonly BigInteger _n;

   public ZkProtocol()
   {
      _sha = SHA256.Create(); //used in other protocol methods
      _random = new Random(DateTime.Now.Millisecond); //seeded from the clock; see note below
      _n = new BigInteger(ZkSafePrimes.N4); //the 4096 bit safe prime modulus
   }
   
   //...other protocol methods excluded for post

   /// <summary>
   /// Generate crypto safe, pseudo random number.
   /// </summary>
   /// <param name="bits">max value supported is 4096</param>
   /// <returns></returns>
   public byte[] CryptRand(int bits = 4096)
   {
      var rb = new byte[256];
      _random.NextBytes(rb);
      var bigrand = new BigInteger(rb);
      //raise the random value to a randomly selected safe prime exponent, mod N
      var crand = BigInteger.ModPow(bigrand,
         ZkSafePrimes.GetSafePrime(_random.Next(0, 2047)), _n);
      var bytes = crand.ToByteArray();
      if (bits >= 4096) return bytes;
      //for smaller requests, return a random contiguous slice of the result
      var count = bits / 8;
      var skip = _random.Next(0, bytes.Length - count);
      return bytes.Skip(skip).Take(count).ToArray();
   }
}

internal static class ZkSafePrimes
{
   internal static byte[] N4
   {
      get { return Safe4096BitPrime; }
   }

   private static byte[] Safe4096BitPrime = new byte[]
   {
      //data not shown here for post (see on GitHub)
   };
   
   public static int GetSafePrime(int index)
   {
      //safe primes occupy the odd positions in the array
      if (index < 1) index = 1;
      else if (index > 2047) index = 2047;
      else if (index % 2 == 0) index++;
      return _safePrimes[index];
   }

   //2048 values; every second value is a safe prime and
   //the value before it is the matching Sophie Germain prime,
   //as in q is the Sophie Germain prime and N = 2q+1 is the safe prime
   private static int[] _safePrimes = new[]
   {
      5903,11807,8741,17483 //first four values only - see GitHub
   };
}
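
And a quick usage sketch:

//generate a 256 bit (32 byte) pseudo random value
var protocol = new ZkProtocol();
var rand = protocol.CryptRand(256);
Console.WriteLine(BitConverter.ToString(rand));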

See the code on GitHub. Look specifically at the ZkProtocol and ZkSafePrimes classes.

I should note that in ServiceWire a new instance of ZkProtocol is used for each connection, so the Random object should provide a healthy random seed to the algorithm given the clock's milliseconds seed at construction. If you find this useful, I would love to hear from you.

ServiceMq Hits 10,000 Downloads

I am pleased to see this milestone of 10,000 downloads in the short history of ServiceMq and its underlying communication library ServiceWire, a faster and simpler alternative to WCF for .NET to .NET RPC. And the source code for both can be found on GitHub here.

[Image: ServiceMq NuGet download count hitting 10,000]

Over the past few weeks both libraries have been improved.

ServiceMq improvements include:

  • Options to persist messages asynchronously, improving overall throughput when message traffic is high
  • ReceiveBulk and AcceptBulk methods were introduced
  • Message caching was refactored to improve performance and limit memory use in scenarios where large numbers of messages are sent and must wait for a destination to become available, or are received and must wait to be consumed
  • Faster asynchronous file deletion was added which eliminates the standard File.Delete’s permission demand on every message file delete
  • Asynchronous append file logging was added to improve throughput
  • The FastFile class was refactored to support IDisposable and now dedicates a single thread each to asynchronous delete, append and write operations
  • Upgraded to ServiceWire 1.6.3

ServiceWire has had two minor but important bugs fixed:

  • Code was refactored to properly dispose of resources when a connection failure occurs.
  • Previously if the host was not hosting the same assembly version of the interface being used, the connection would hang. This scenario now properly throws an identifiable exception on the client and disposes of the underlying socket or named pipe stream.

Real World Use

In the last month or so, I have had the opportunity to use both of these libraries extensively at work. All of the recent improvements are a direct or indirect result of that real world use. Without disclosing work related details, I believe it is safe to say that these libraries are moving hundreds of messages per second, and in some cases 30GB of data between two machines in around three minutes across perhaps 300 RPC method invocations. Some careful usage has been required, given our particular use cases, in order to reduce connection contention from many thousands of message writer threads across a pool of servers all talking to a single target server. I’ve no doubt that a little fine tuning on the usage side may be required, but overall I’m very happy with the results.

I hope you enjoy these libraries and please contact me if you find any problems with them or need additional functionality. Better yet, jump onto GitHub and submit a pull request of your own. I am happy to evaluate and accept well thought out requests that are in line with my vision for keeping these libraries lightweight and easy to use.

One other note

I recently published ServiceMock, a tiny experimental mocking library that has surprisingly been downloaded over 500 times. If you’re one of the crazy ones using it, I’d love to hear from you and find out what you think of it.

Agile and Architecture

One of the most misunderstood and misrepresented documents in the history of software development is the Agile Manifesto. This may be due to many of its readers overlooking the phrase “there is value in the items on the right.” Most seem to focus on the items on the left only. Here’s the text that Cunningham, Fowler, Martin and other giants in the field created:

Manifesto for Agile Software Development

We are uncovering better ways of developing
software by doing it and helping others do it.
Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on the right,
we value the items on the left more.

Note the items on the right. These do indeed have value, but so many advocates of Agile deliberately ignore and even exclude them from their software development process and organization. Some have advocated eliminating architecture and design entirely, leaving them open to gradual discovery through the iterative process driven by use cases, user stories and backlog tasks.

Recently I have read a number of discussions, blog posts and articles on the question of Agile and architecture, and the comments and discussion around the topic have been interesting. The general theme of these sources is that architecture (and design) are at odds with Agile, perhaps due to the notion that developing software is only about writing the code. This is one of the great fallacies of our time.

Architecture and Design are Software Development Artifacts

Teams and organizations who skip architecture and design will sooner or later find themselves off track and repeating work unnecessarily. Such waste is not entirely preventable, but this does not mean we should not try. Teams that incorporate these activities into their iterations, regularly revisiting architectural questions such as non-functional requirements and component design, will find that they are better able to stay on course.

Organizations that have multiple teams will find greater stability in moving forward when guided by a centralized architecture team made up of architects or leads who are dedicated to and work within the organization’s development teams. The architecture team works in an Agile fashion, with its own backlog and its own products, including working prototypes, cross-cutting standardized components, documentation of the architecture and designs, and work items to be placed on the backlogs of development teams.

In this way, development teams have dependencies on the architecture team and can make requests for additional guidance or improvements or extensions to shared, standardized libraries for which the architecture team is responsible. These requests keep the backlog of the architecture team charged with work throughout the software development lifecycle.

In addition to formal activities to improve architecture and design across the organization, the architecture team should regularly interact with development teams in ways that include (but are not limited to) the following:

  • practice improvement activities—e.g. SOLID principles
  • technology deep dives—e.g. digging deeper into .NET
  • technical solutions brainstorming sessions—solving the hard problems
  • technical debt evaluation and pay-down planning
  • code reviews and walkthroughs—one on one and as a team
  • presenting and sharing solutions and ideas from other development teams
  • exploring and evaluating new technologies and tools

Like testing and coding, architecture and design are part of the whole of software development. These activities are perfectly suited to Agile development practices, including Scrum. And when all of these aspects of delivering quality software are taken into account and incorporated into your Agile process, your chances of success are greatly improved.

----

P.S. And if you add to all this a great DevOps team to support your efforts with automated build and deploy systems, your life will be that much easier and your chances of success are automatically improved.