Interesting bug when porting Ubuntu One on Windows

by mandel on September 1st, 2010

I have been working for about 2 moths now and after releasing our internal alpha release I have found a very interesting bug. In our port to windows we have decided to try and make your live as nice as possible in an environment as rough as Windows and to do so we allow our port to auto-update to always deliver the latests bug fixes.

Ofcourse to ensure that we are updating you system with the correct data we always perform a check sum of the msi. While our msi is updating to S3 using python, the code that downloads it is C#. Here are the different codes to calcualte the checksum:

    fd = open(filename,'r')
    md5_hash = hashlib.md5(fd.read()).hexdigest()
private static string Checksum(AlgoirhtmsEnum algorithm, string filePath)
        {
            HashAlgorithm hash;
            switch (algorithm)
            {
                case AlgoirhtmsEnum.SHA1:
                    hash = new SHA1Managed();
                    break;
                case AlgoirhtmsEnum.SHA256:
                    hash = new SHA256Managed();
                    break;
                case AlgoirhtmsEnum.SHA384:
                    hash = new SHA384Managed();
                    break;
                case AlgoirhtmsEnum.SHA512:
                    hash = new SHA512Managed();
                    break;
                default:
                    hash = new MD5CryptoServiceProvider();
                    break;
 
            }
            var checksum = "";
            using (var stream = new FileStream(filePath, FileMode.Open, FileAccess.Read))
            {
                var md5 = hash.ComputeHash(stream);
                for (var i = 0; i < md5.Length; i++)
                {
                    checksum += md5[i].ToString("x2").ToLower();
                }
            }
            return checksum;
        }

Believe it or not the hash returned by each piece of code was different. WTF!!!! After a ridiculous time looking at it I managed to spot the issue. If you are using python on windows, unless you use the b option when opening a file, python will convert all the CRLF to LF making the hash to be different, how to fixed this? simply open the file this way:

fd = open(filename,'rb')

Go an figure….