Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 5F3C7200BF1 for ; Tue, 3 Jan 2017 19:51:36 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 5DD3F160B43; Tue, 3 Jan 2017 18:51:36 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id A7350160B20 for ; Tue, 3 Jan 2017 19:51:35 +0100 (CET) Received: (qmail 4525 invoked by uid 500); 3 Jan 2017 18:51:29 -0000 Mailing-List: contact reviews-help@mesos.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: reviews@mesos.apache.org Delivered-To: mailing list reviews@mesos.apache.org Received: (qmail 4481 invoked by uid 99); 3 Jan 2017 18:51:29 -0000 Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Jan 2017 18:51:29 +0000 Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id CAC79310BB8; Tue, 3 Jan 2017 18:51:28 +0000 (UTC) Content-Type: multipart/alternative; boundary="===============7911764083273344835==" MIME-Version: 1.0 Subject: Re: Review Request 54877: Windows: Stout: Removed dependency on Shell API. From: Andrew Schwartzmeyer To: Daniel Pravat , Joseph Wu , Alex Clemmer Cc: Mesos ReviewBot , mesos Date: Tue, 03 Jan 2017 18:51:28 -0000 Message-ID: <20170103185128.13569.82000@reviews.apache.org> X-ReviewBoard-URL: https://reviews.apache.org/ Auto-Submitted: auto-generated Sender: Andrew Schwartzmeyer X-ReviewGroup: mesos X-Auto-Response-Suppress: DR, RN, OOF, AutoReply X-ReviewRequest-URL: https://reviews.apache.org/r/54877/ X-Sender: Andrew Schwartzmeyer References: <20161220002928.7747.89057@reviews.apache.org> In-Reply-To: <20161220002928.7747.89057@reviews.apache.org> Reply-To: Andrew Schwartzmeyer X-ReviewRequest-Repository: mesos archived-at: Tue, 03 Jan 2017 18:51:36 -0000 --===============7911764083273344835== MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit > On Dec. 20, 2016, 12:29 a.m., Daniel Pravat wrote: > > 3rdparty/stout/include/stout/windows/os.hpp, line 748 > > > > > > I don't think the conversion to UTF-8 is appropiate here. > > Andrew Schwartzmeyer wrote: > What would you convert it to? It's currently in UTF-16, and Windows paths are allowed to have (almost) any Unicode character. > > Alex Clemmer wrote: > I think he's saying that bad things will happen if you use strings that have Unicode in them, so it's probably better to just error out. > > Andrew Schwartzmeyer wrote: > I'm not following... That is a case our codebase must deal with, NTFS paths are stored in Unicode: > > https://msdn.microsoft.com/en-us/library/windows/desktop/dd317748(v=vs.85).aspx > > and on top of that, Windows file paths can: > > > Use any character in the current code page for a name, including Unicode characters and characters in the extended character set (128–255) > > https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx > > So what I'm saying is that bad things _must not happen_ if we have paths with non-ASCII characters in them; we need to handle that at any point we deal with file paths on Windows. > > I'm not saying this function is likely to return a path like `C:\ÂÑÐ¥`, but that would be a valid file path, and so we must be treating all our paths on Windows as Unicode. > > If this is not the case, are purposefully constraining ourselves to ASCII, and if so, why? > > Alex Clemmer wrote: > Sorry, I meant bad things will happen to Mesos if you give it strings with unicode in them. Mesos itself does not deal with Unicode gracefully. > > Andrew Schwartzmeyer wrote: > Oooh. We should probably fix that. > > Andrew Schwartzmeyer wrote: > Alex, Daniel, do we have a resolution on this? I spoke with Alex and we feel that we should be doing two main things: avoid exacberating the problem in MESOS-6817 by being explicit in our use of `W` APIs, rather than relying on the mercurial value of `TCHAR`; and not attempt to guard against paths that Mesos does not handle, as it is a bug that Mesos does not handle paths with valid extended characters (since both Linux and Windows may have these paths), so failing here would be inappropriate as it hides the actual bug. (Not to speak for you Alex, I think this summarizes our discussion yesterday.) > > Alex Clemmer wrote: > If I'm understanding our discussion correctly, failing here on unicode conversion would be preferable, since it makes a subtle bug more obvious. If that's nontrivial I think we can just ship it as is, since Mesos already has these problems. > > Andrew Schwartzmeyer wrote: > > failing here on unicode conversion > > What would fail here? > > I *think* you're implying to add a detection algorithm to see if there are any characters in the path that Mesos does not believe are valid; but that is not "failing on Unicode", we'd need to write our own validator for Mesos. > > Alex Clemmer wrote: > I was just saying, if there's an easy way to try to interpret the string as normal ASCII and return `Error` if that fails, we should do that. If not, it's not that big of a deal, because Mesos will probably fail anyway. Ah I see what you want: > interpret the string as normal ASCII and return Error if that fails but I don't believe this is possible. To be clear, the non-Unicode versions of these APIs do not return ASCII; they return the same data encoded with the current Windows code page (which is a superset of ASCII, in much the same way as UTF-8). If we used this version, we would not only hit the same problem (possible non-ASCII characters), but also introduce a new problem of having to decode from an arbitrary Windows code page (rather than UTF-16 which is at least standard). With either the short or wide version of the API, detecting if the string is not ASCII would mean implmenenting a validator, which I don't think we want to do here (it's the wrong solution to the problem). That said, what would you suggest? P.S. For what it's worth, I don't expect this particular API to give us non-ASCII characters, but I think we should choose a pattern for dealing with Windows APIs and stick to it. This pattern gives us guaranteed UTF-8 strings with whatever data we're expected to handle. - Andrew ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/54877/#review159672 ----------------------------------------------------------- On Dec. 19, 2016, 11:20 p.m., Andrew Schwartzmeyer wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/54877/ > ----------------------------------------------------------- > > (Updated Dec. 19, 2016, 11:20 p.m.) > > > Review request for mesos, Daniel Pravat, Alex Clemmer, and Joseph Wu. > > > Repository: mesos > > > Description > ------- > > The API `SHGetKnownFolderPath` requires `Shell32.dll`, > which is not available on Nano server. > The equivalent API `GetAllUsersProfileDirectory` > only requires `Userenv.dll`, which is available on Nano. > > This API is also friendlier, as we own the allocation. > > The Unicode version `GetAllUsersProfileDirectoryW` is > explicitly used so that we are guaranteed a Unicode path, > which we then convert from UTF-16 to UTF-8, > instead of using the ASCII version which depends on a > varying Windows code-page, and is not recommended. > > A `vector` is used over a `wstring` to avoid dealing > with the placement of the null-terminating character. > > > Diffs > ----- > > 3rdparty/stout/include/stout/windows.hpp e641c46d033372e1b6c9f9c066b1ad4957d55088 > 3rdparty/stout/include/stout/windows/os.hpp 5cd92545a49648e39e8eb7cf131895e9cfc97902 > > Diff: https://reviews.apache.org/r/54877/diff/ > > > Testing > ------- > > cmake && msbuild, attach agent to master and check default `runtime_dir` value. > > > Thanks, > > Andrew Schwartzmeyer > > --===============7911764083273344835==--