Modern C++

Long Filenames in Windows 8

Do you hate the 259-character filename limit in some Windows file systems? So does Kenny. Here's a way to eliminate it and free your application and users from the restriction.

Windows 8 finally addresses a longstanding issue with its support for long filenames. While the Windows file systems, notably NTFS, have supported long filenames for longer than I can remember, the Windows shell has been stuck with an antiquated limit of 259 characters. This is the effective limit imposed by the infamous MAX_PATH constant. Fortunately, the Windows shell is beginning to support longer paths; and while it doesn't yet go far enough, it does provide a new set of path management functions that fully support long filenames.

The Windows shell provides a set of so-called lightweight utility functions that provide various string manipulation routines. You can, for example, use the PathAppend function to append one path to the end of another, like this:

wchar_t path[10] { L"C:\\" };
VERIFY(PathAppend(path, L"PathTooLong"));

You just need to tell the compiler where to find this library:

#include <shlwapi.h>
#pragma comment(lib, "shlwapi")

Unfortunately, the PathAppend function just assumes that the buffer pointed to by the path variable is at least MAX_PATH, or 260 characters long, including space for a null terminator. While it returns a Boolean value indicating whether the operation succeeded, it won't help you in this case -- you'll be rewarded with stack corruption, if you're lucky. PathAppend expects that the buffer is declared as follows:

wchar_t path[MAX_PATH] { L"C:\\" };

MAX_PATH includes the null terminator, so there's no need to add a "+ 1" to the array's size. Now all is well, but relying on every caller to provide a large enough buffer is problematic. In Windows 8, this family of path management functions has been deprecated and replaced with a new set of more secure functions that expect callers to provide the size of the buffer to avoid such problems.

To begin with, there's a direct replacement for functions like PathAppend:

wchar_t path[10] { L"C:\\" };

HRESULT const result = PathCchAppend(path,
                                     _countof(path),
                                     L"PathTooLong");

There are corresponding Cch functions for most of the original path management functions that the Windows shell originally provided. These functions come courtesy of a new header and lib file:

#include <pathcch.h>
#pragma comment(lib, "pathcch")

These functions all expect an indication of the size of the buffer being provided. They also return HRESULT values, rather than the simpler Boolean value used by the original functions. This means that far more useful error information may be gleaned. The original functions didn't make any further information available through the GetLastError function, so you were often left guessing.

In the example above, given that the buffer is in fact too small, PathCchAppend simply rejects the request and returns the ERROR_INSUFFICIENT_BUFFER error code wrapped in an HRESULT:

ASSERT(result == HRESULT_FROM_WIN32(ERROR_INSUFFICIENT_BUFFER));
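If you trace that value, it helps to recognize it in hexadecimal. Here's a minimal, portable sketch of the arithmetic behind HRESULT_FROM_WIN32, assuming ERROR_INSUFFICIENT_BUFFER has its usual value of 122 and FACILITY_WIN32 is 7. The hresult_from_win32 name is mine, purely for illustration; production code should stick with the macro from the Windows SDK, which also passes existing HRESULT values through unchanged:

```cpp
#include <cstdint>

// Illustrative re-creation of the HRESULT_FROM_WIN32 arithmetic (not the SDK macro).
// Assumes 'error' is a plain Win32 error code such as 122.
constexpr auto hresult_from_win32(std::uint32_t const error) -> std::uint32_t
{
    // Success (0) passes through. Otherwise the low 16 bits carry the
    // Win32 code, facility 7 (FACILITY_WIN32) is placed at bit 16, and the
    // top bit marks the value as a failure.
    return 0u == error ? 0u
                       : (error & 0x0000FFFFu) | (7u << 16) | 0x80000000u;
}

// 122 is assumed to be ERROR_INSUFFICIENT_BUFFER.
static_assert(hresult_from_win32(122u) == 0x8007007Au,
              "insufficient buffer maps to 0x8007007A");
```

So an insufficient buffer shows up as 0x8007007A when the HRESULT is printed in hexadecimal.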

But for some strange reason, this newer and safer function doesn't do anything to relieve the 259-character limitation on the path as a whole; it just makes the code you've always written a little more secure.

Fortunately, another new function called PathCchAppendEx does provide support for long paths, but only if you specifically request it with an optional flag:

result = PathCchAppendEx(path, _countof(path), L"Path", PATHCCH_ALLOW_LONG_PATHS);

In this case, if the combined path exceeds 259 characters, it will happily proceed as long as there's sufficient room in the buffer provided by the caller. This is really the function to use, and I don't see any reason why you should limit your applications and users to short paths.

Let's wrap this function up with a little help from C++. It may be more secure, but it's still somewhat error prone, as you need to remember to specify the correct buffer size.

Lest we overlook a path management error, let's use an exception to report such errors:

struct path_exception
{
    HRESULT code;

    path_exception(HRESULT const result) :
        code { result }
    {}
};

And I usually like to define a simple helper function to check the result of such functions:

auto check(HRESULT const result) -> void
{
    if (S_OK != result)
    {
        throw path_exception(result);
    }
}

Now we can write a simple helper function template and allow the compiler to deduce the buffer size for us:

template <unsigned Count>
auto path_append(wchar_t (&path)[Count],
                 wchar_t const * more) -> void
{
    check(PathCchAppendEx(path,
                          Count,
                          more,
                          PATHCCH_ALLOW_LONG_PATHS));
}

This now gives us the best of all worlds: the simpler interface of the older PathAppend function, the security of buffer size checking, and the assurance that the compiler knows what to do:

path_append(path, L"MorePath");
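The size deduction has nothing to do with Windows, so it's easy to try out on its own. Here's a minimal sketch in which bounded_append is a hypothetical stand-in for PathCchAppendEx: it only checks the capacity and concatenates, with none of the separator or preamble handling the real function performs. The append template and the use of std::length_error are likewise just for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <stdexcept>

// Hypothetical stand-in for PathCchAppendEx: appends 'more' to the
// null-terminated string in 'buffer' only if the result, including the
// null terminator, fits within 'count' characters.
inline auto bounded_append(wchar_t * buffer,
                           std::size_t const count,
                           wchar_t const * more) -> bool
{
    assert(buffer != nullptr && more != nullptr);

    std::size_t const used = std::wcslen(buffer);
    std::size_t const extra = std::wcslen(more);

    if (used + extra + 1 > count)
    {
        return false; // too small; the buffer is left untouched
    }

    std::wmemcpy(buffer + used, more, extra + 1); // copies the terminator too
    return true;
}

// The compiler deduces Count from the array argument, so the caller
// can't pass a size that disagrees with the buffer.
template <std::size_t Count>
auto append(wchar_t (&buffer)[Count],
            wchar_t const * more) -> void
{
    if (!bounded_append(buffer, Count, more))
    {
        throw std::length_error("buffer too small");
    }
}
```

Calling append with a wchar_t[10] holding L"C:\\" and L"PathTooLong" throws, because 3 + 11 + 1 characters don't fit in 10, while the same call against a larger array succeeds.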

Given the PATHCCH_ALLOW_LONG_PATHS flag, the PathCchAppendEx function will also intelligently prefix the "\\?\" preamble to indicate that a path is longer than 259 characters, as required by the operating system's file and directory management functions. It will strip off the preamble if it determines that the path is within the short path limit, which is a nice touch.

Of course, even this is a little tedious as we're dealing with stack-allocated buffers. Eventually you may need to handle paths more naturally with standard C++ string objects. So let's see if we can write a version of this helper function that can free us from the stack and deal with string objects backed by the heap.

Consider a standard string representing a path:

auto path = std::wstring { L"C:\\" };

Since we're not dealing with a fixed buffer known at compile time, we need to think about how much additional storage is required prior to calling the PathCchAppendEx function. This is relatively simple. We need to cater for the following concatenation:

\\?\path\more

So there's the four-character preamble, followed by the original path, followed by a backslash, which separates it from the path being appended. That's five additional characters over and above the size of the original path, as well as the appended path. This is the worst case. The preamble may not be required if the path is short enough. It's also possible that the original path may include a trailing backslash. Still, this is safe and certainly not excessive.

We can then resize the path to accommodate these characters as well as the appended path as follows, where size is the size of the path to append:

path.resize(path.size() + 5 + size);
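To make the arithmetic concrete, here's a small sketch. The worst_case_size helper is a hypothetical name of my choosing, not part of any library:

```cpp
#include <cstddef>

// Worst-case length, excluding the null terminator, of the combined path:
// four characters for the "\\?\" preamble, the original path, one
// separating backslash, and the path being appended.
constexpr auto worst_case_size(std::size_t const path_size,
                               std::size_t const more_size) -> std::size_t
{
    return 4 + path_size + 1 + more_size;
}

// L"C:\\" is 3 characters and L"MorePath" is 8 characters.
static_assert(worst_case_size(3, 8) == 16, "4 + 3 + 1 + 8");
```

For these inputs the helper reserves 16 characters, whereas the actual result, C:\MorePath, needs only 11. The five spare characters are the price of not having to inspect the path up front.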

We can then simply pass the standard string to the PathCchAppendEx function directly and avoid any further buffer calculations or copies:

check(PathCchAppendEx(&path[0],
                      path.size() + 1,
                      more,
                      PATHCCH_ALLOW_LONG_PATHS));

Keep in mind that these functions expect the reported buffer size to include the null terminator, hence the "+ 1" in the statement above.

Of course, now we're in a position where we might have a standard string whose reported size is more than its actual size. For that, we can simply trim it down to size:

path.resize(wcslen(path.c_str()));
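The grow, write and trim pattern isn't specific to the path functions. Here's a portable sketch using std::wcscat as a stand-in for the C-style API; append_in_place is a hypothetical name, and the caller is assumed to pass an extra value at least as large as the appended string:

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Grows 'text' by 'extra' characters, lets a C-style routine write into
// the string's own buffer, then trims the string back to the null
// terminator that the routine left behind.
inline auto append_in_place(std::wstring & text,
                            wchar_t const * more,
                            std::size_t const extra) -> void
{
    assert(std::wcslen(more) <= extra);     // assumption: caller reserved enough

    text.resize(text.size() + extra);       // padding is filled with L'\0'
    std::wcscat(&text[0], more);            // stand-in for the C-style API
    text.resize(std::wcslen(text.c_str())); // drop the unused padding
}
```

After the call, the string's size once again matches the characters it holds, so a later size() or comparison behaves as you'd expect.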

And that's about it. We might even include wrappers to handle appending raw character strings as well as standard strings. Here's a complete example of what this might look like:

struct path_exception
{
    HRESULT code;

    path_exception(HRESULT const result) :
        code { result }
    {}
};

auto check(HRESULT const result) -> void
{
    if (S_OK != result)
    {
        throw path_exception(result);
    }
}

namespace details
{
    auto path_append(std::wstring & path,
                     wchar_t const * more,
                     size_t const size) -> void
    {
        path.resize(path.size() + 5 + size); 

        check(PathCchAppendEx(&path[0],
                              path.size() + 1,
                              more,
                              PATHCCH_ALLOW_LONG_PATHS));

        path.resize(wcslen(path.c_str()));
    }
}

auto path_append(std::wstring & path,
                 wchar_t const * more) -> void
{
    details::path_append(path,
                         more,
                         wcslen(more));
}

auto path_append(std::wstring & path,
                 std::wstring const & more) -> void
{
    details::path_append(path,
                         more.c_str(),
                         more.size());
}

auto main() -> int
{
    try
    {
        auto path = std::wstring { L"C:\\" };

        path_append(path, L"MorePath");
        path_append(path, std::wstring { L"MorePath2" });

        TRACE(L"%s\n", path.c_str());
    }
    catch (path_exception const & e)
    {
        TRACE(L"%x\n", e.code);
    }
}

About the Author

Kenny Kerr is a computer programmer based in Canada, an author for Pluralsight and a Microsoft MVP. He blogs at kennykerr.ca and you can follow him on Twitter at twitter.com/kennykerr.
