Bitten in the ass by ReadDirectoryChangesW and multithreading

by mandel on April 20th, 2011

During the past few days I have been trying to track down an issue in the Ubuntu One client tests that, when run on Windows, would use up all the threads that the Python process could have. As you can imagine, finding out why there are deadlocks is quite hard, especially when I thought that the code was thread safe. Guess what? It wasn’t.

The bug I had in the code was related to the way in which ReadDirectoryChangesW works. This function can be executed in two different ways:

Synchronous

ReadDirectoryChangesW can be executed in sync mode by NOT providing an OVERLAPPED structure to perform the IO operations, for example:

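import win32con
import win32file

def _watcherThread(self, dn, dh, changes):
        # Synchronous version: no OVERLAPPED is passed, so each call blocks
        # until something changes in the watched directory.
        flags = win32con.FILE_NOTIFY_CHANGE_FILE_NAME
        while 1:
            print("waiting", dh)
            # Use a separate name so the 'changes' list passed in by the
            # caller is not shadowed by the result of the call.
            new_changes = win32file.ReadDirectoryChangesW(dh,
                                                          8192,
                                                          False, #sub-tree
                                                          flags)
            print("got", new_changes)
            changes.extend(new_changes)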

The above example has the following two problems:

  • ReadDirectoryChangesW without an OVERLAPPED blocks indefinitely.
  • If another thread attempts to close the handle while ReadDirectoryChangesW is waiting on it, the CloseHandle() call blocks (which has nothing to do with the GIL; it is correctly managed).

I got bitten in the ass by the second item, which broke my tests in two different ways: it left threads blocked and it left a handle in use, so the rest of the tests could not remove the tmp directories that were still held by the blocked threads.

Asynchronous

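In order to use the async version of the function we just have to pass an OVERLAPPED structure (the directory handle must also have been opened with FILE_FLAG_OVERLAPPED in the CreateFile call, otherwise the call still blocks). This way the IO operations will not block and we will also be able to close the handle from a different thread.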

import pywintypes
import win32con
import win32event
import win32file

def _watcherThreadOverlapped(self, dn, dh, changes):
        flags = win32con.FILE_NOTIFY_CHANGE_FILE_NAME
        buf = win32file.AllocateReadBuffer(8192)
        overlapped = pywintypes.OVERLAPPED()
        overlapped.hEvent = win32event.CreateEvent(None, 0, 0, None)
        while 1:
            win32file.ReadDirectoryChangesW(dh,
                                            buf,
                                            False, #sub-tree
                                            flags,
                                            overlapped)
            # Wait for our event, or for 5 seconds.
            rc = win32event.WaitForSingleObject(overlapped.hEvent, 5000)
            if rc == win32event.WAIT_OBJECT_0:
                # got some data!  Must use GetOverlappedResult to find out
                # how much is valid!  0 generally means the handle has
                # been closed.  Blocking is OK here, as the event has
                # already been set.
                nbytes = win32file.GetOverlappedResult(dh, overlapped, True)
                if nbytes:
                    bits = win32file.FILE_NOTIFY_INFORMATION(buf, nbytes)
                    changes.extend(bits)
                else:
                    # This is "normal" exit - our 'tearDown' closes the
                    # handle.
                    # print "looks like dir handle was closed!"
                    return
            else:
                print("ERROR: Watcher thread timed-out!")
                return # kill the thread!

Using the ReadDirectoryChangesW function in this way solves the issues found in the sync version. The only extra overhead is that you need to understand how to deal with Win32 event objects, which is not that hard after you have worked with them for a little while.
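The watcher above takes the handle dh as given. Below is a minimal sketch of how such a handle could be opened so that overlapped calls are allowed. The function name open_directory_for_watching and the share mode are assumptions made for illustration and are not part of the original code. The constants are the documented Win32 values, written out by hand so the flag combination is explicit.

```python
# Documented Win32 constant values, spelled out for clarity.
FILE_LIST_DIRECTORY = 0x0001
FILE_SHARE_READ = 0x00000001
FILE_SHARE_WRITE = 0x00000002
OPEN_EXISTING = 3
FILE_FLAG_BACKUP_SEMANTICS = 0x02000000  # needed to open a directory
FILE_FLAG_OVERLAPPED = 0x40000000        # needed for asynchronous IO

ASYNC_DIR_FLAGS = FILE_FLAG_BACKUP_SEMANTICS | FILE_FLAG_OVERLAPPED


def open_directory_for_watching(path):
    """Hypothetical helper: open `path` so ReadDirectoryChangesW can be
    used with an OVERLAPPED structure. Windows only."""
    import win32file  # pywin32; imported lazily because it needs Windows

    return win32file.CreateFile(
        path,
        FILE_LIST_DIRECTORY,
        FILE_SHARE_READ | FILE_SHARE_WRITE,  # assumed share mode
        None,                                # default security attributes
        OPEN_EXISTING,
        ASYNC_DIR_FLAGS,
        None)                                # no template file
```

On Windows, the returned handle would then be the dh passed to _watcherThreadOverlapped.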

I leave this here for people that might find the same issue and for me to remember how much my ass hurt.

References

  • simple_elegance

    Thank you for this code! I tried your asynchronous method and it works, but I am having the following problem: when I try to delete the watched folder from Python or from Windows Explorer, I get:

    WindowsError: [Error 32] The process cannot access the file because it is being used by another process: ‘c:\users\user\appdata\local\temp\new_dir’

    I believe this makes sense, but how should I solve this? Because my application should allow the user to remove a watched folder. Do you have any idea?

    • http://mandel.themacaque.com mandel

      That is a known issue. You cannot delete the folder because the watcher holds a handle pointing to the directory, which stops any process from deleting or renaming it.

      The simplest way to solve your problem is to stop the watch and ensure that the Win32 CloseHandle function is called on that handle. I’ll be posting a new version of this code that uses twisted and that has a much cleaner approach.
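A minimal sketch of that "stop the watch" step, under stated assumptions: StoppableWatch is a hypothetical helper (it is not the Ubuntu One Watch class), the handle is any object with a Close() method such as the PyHANDLE returned by win32file.CreateFile, and the thread is the one running the overlapped watcher loop.

```python
import threading


class StoppableWatch(object):
    """Hypothetical helper pairing a directory handle with its watcher thread."""

    def __init__(self, handle, thread):
        self._handle = handle  # e.g. the PyHANDLE returned by CreateFile
        self._thread = thread  # thread running the overlapped watcher loop

    def stop(self, timeout=5.0):
        """Close the handle, then wait for the watcher thread to finish.

        Returns True if the thread exited within `timeout` seconds.
        """
        # In the article's watcher loop a closed handle is expected to show
        # up as a 0-byte result, which makes the loop return. Closing it
        # also releases this process's hold on the directory.
        self._handle.Close()
        self._thread.join(timeout)
        return not self._thread.is_alive()
```

Once stop() returns True, this process no longer holds the directory open, so removing the folder should no longer fail because of the watcher.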

  • http://mandel.themacaque.com mandel

    If you can use twisted the following is a better implementation: http://www.themacaque.com/?p=900

  • Tim-Erwin

    Thanks for sharing this code (which can already be found elsewhere, e.g. http://www.java2s.com/Open-Source/Python/Windows/pyExcelerator/pywin32-214/win32/test/test_win32file.py.htm). However, here (Win 7) ReadDirectoryChangesW() blocks although I provide an overlapped structure. I copied and pasted your code (omitting the loop) line by line into the interpreter (Python 2.7), and it hangs at ReadDirectoryChangesW() until I change something in the directory. I also tried closing the handle and waiting on multiple objects, setting one of them explicitly. Nothing seems to work. Any idea what I’m doing wrong?

  • http://mandel.themacaque.com mandel

    @Tim the problem you are having is the following:

    If you copy the code as it is (without the loop), the main thread of the interpreter gets blocked because no changes were made. The way to do this is to call the method from a different thread.

    For example, let’s assume that you copied the code in the example, without the loop, into a function called ‘get_changes’. The simplest way to execute the function in a different thread is to use the threading module, for example:

    import threading
     
    t = threading.Thread(target=get_changes, args=your_args_go_here)
    t.start()

    Here the target (which has to be a callable object) is the function that will be executed in a different thread, and the args are the arguments to be passed to the callable object.

    You can find more info on how to execute the function in a different thread here: threading.Thread

    Edit: Please let me know if this solves your question.

  • Tim-Erwin

    Thanks a lot for taking the time to help me. I think we need to clarify a few things here:
    * Does the loop change anything here? I’m asking because you’re stressing it. I only omitted it to make things easier.
    * I expected the thread not to wait on ReadDirectoryChangesW() but on WaitForSingleObject(), however, I don’t get past the former until a change is made. Am I mistaken?
    * I already created an application using Threads (tried your proposal anyways); surely, the application itself is not hanging then, but each single thread is. I thought I could wait for a change OR 5000 millis and then go on. Alternatively, I tried WaitForMultipleObjects((overlapped.hEvent, _stop_waiting), …) but SetEvent(_stop_waiting) has no effect.

    Maybe I have not understood the idea here. Your Watch class from the Ubuntu One watcher, however, seems to do it exactly as I’m trying to – and I assume it works :)

  • http://mandel.themacaque.com mandel

    @Tim let’s look at the issue step by step:

    1. Does the loop change anything here?

      It makes no difference, besides the fact that without the loop you will only get the events delivered by that single call and no more.

    2. I expected the thread not to wait on ReadDirectoryChangesW() but on WaitForSingleObject(), however, I don’t get past the former until a change is made. Am I mistaken?

      Your expectations are correct, and the code is indeed waiting in WaitForSingleObject. Take a close look at the code:

       win32file.ReadDirectoryChangesW(dh,
           buf,
           False, #sub-tree
           flags,
           overlapped)
       # Wait for our event, or for 5 seconds.
       rc = win32event.WaitForSingleObject(overlapped.hEvent, 5000)

      The code will not go past the call to WaitForSingleObject until ReadDirectoryChangesW writes events into the buffer and signals the event in the OVERLAPPED structure, or until the 5 second timeout expires. Once that happens we continue and get the return code, in this case stored in rc.

    3. I already created an application using Threads (tried your proposal anyways); surely, the application itself is not hanging then, but each single thread is.

      Yes, each of the threads will hang until either you get events, the timeout for waiting runs out or you send a _stop_waiting event. Where are you setting the _stop_waiting event?

    Is there anywhere you store the code where I can take a look? It will be easier to help you if you show the entire code :)
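The "wait for a change OR a stop event" idea discussed in this exchange can be sketched as follows. This is only an illustration: describe_wait and wait_for_change_or_stop are hypothetical names, the constants are the documented Win32 values written out by hand, and the sketch assumes ReadDirectoryChangesW has already returned immediately, which requires a handle opened for overlapped IO.

```python
WAIT_OBJECT_0 = 0x00000000  # documented Win32 values
WAIT_TIMEOUT = 0x00000102


def describe_wait(rc, names):
    """Map a WaitForMultipleObjects return code to a readable name.

    `names` lists one label per handle, in the order the handles were
    passed to the wait call.
    """
    if rc == WAIT_TIMEOUT:
        return "timeout"
    index = rc - WAIT_OBJECT_0
    if 0 <= index < len(names):
        return names[index]
    return "unexpected"


def wait_for_change_or_stop(change_event, stop_event, timeout_ms=5000):
    """Hypothetical helper: block until either event is signalled or the
    timeout expires. Windows only."""
    import win32event  # pywin32; imported lazily because it needs Windows

    # False means "wake up when ANY of the handles is signalled".
    rc = win32event.WaitForMultipleObjects(
        [change_event, stop_event], False, timeout_ms)
    return describe_wait(rc, ("change", "stop"))
```

A watcher loop could call wait_for_change_or_stop(overlapped.hEvent, stop_event) and return when the result is "stop", while another thread calls win32event.SetEvent(stop_event) to request the shutdown.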

  • Tim-Erwin

    Well, it seems I got the idea right so far. What you wrote is exactly what I expected. Now the problem: WaitForSingleObject() is not reached until a change is made in the watched folder. I pasted a simple demonstration here: http://pastebin.com/iVf1eKWw
    If you’d give me your email I could also send you the more complex version without spamming this page too much. Really appreciate your effort.

  • Pingback: ReadDirectoryChangesW, if you want it asyc use the correct flag | Macaque Project

  • http://mandel.themacaque.com mandel

    @Tim-Erwin I have written a blog post with a proper response to your issues, which you can find here.

    As you will read in the post, the issue is not related to the ReadDirectoryChangesW call but to the CreateFile one. Yes, the Windows API is THAT bad.

    I hope the code helps you and please let me know how it goes :)

    PS: You are not spamming the post at all. I love to have these kinds of comments because they will help other programmers.