Crate npath
Normalized Paths
npath is a Rust library providing methods for cross-platform lexical path processing and
normalization. These methods are implemented in extension traits to Path and PathBuf.
Usage
Add npath to your Cargo.toml:
[dependencies]
npath = { git = "https://github.com/gdzx/npath" }
Import the following traits:
use npath::{NormPathExt, NormPathBufExt};
Overview
std::path lacks methods for lexical path processing, which:
- Do not rely on system calls.
- Remove the need to handle I/O errors.
- Support more operations.
- Allow processing paths to files or directories that do not exist.
The following sections outline the main features this library provides.
Joining paths
One of the most basic operations is joining two paths. Trying to get
C:\Users\User\Documents\C:\foo using Path::join can yield an entirely different path:
use std::path::Path;
assert_eq!(
    Path::new(r"C:\Users\User\Documents").join(r"C:\foo"),
    Path::new(r"C:\foo"),
);
Although paths are represented by strings, Path::join is a high-level method that processes
its second argument to determine if it is absolute. In contrast, the fundamental operation
of appending a path to another by string concatenation is called a lexical join.
NormPathExt::lexical_join joins two paths with an operation similar to string
concatenation, only adding a path separator in between if needed. Path::join is a
refinement of a lexical join:
use std::path::{Path, PathBuf};
use npath::NormPathExt;
fn join(base: &Path, path: &Path) -> PathBuf {
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        base.lexical_join(path)
    }
}
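To make the notion of a lexical join concrete, here is a minimal sketch that uses only the
standard library. It is not npath's implementation: the name lexical_join_sketch is
hypothetical, and the exact separator handling of NormPathExt::lexical_join may differ.

```rust
use std::path::{is_separator, Path, PathBuf, MAIN_SEPARATOR};

/// Hypothetical helper (not part of npath): appends `path` to `base` by
/// concatenation, inserting a separator only when neither side provides one.
fn lexical_join_sketch(base: &Path, path: &Path) -> PathBuf {
    let base_str = base.to_string_lossy();
    let path_str = path.to_string_lossy();

    let base_ends_with_sep = base_str.chars().next_back().map_or(false, is_separator);
    let path_starts_with_sep = path_str.chars().next().map_or(false, is_separator);

    // Start from the raw bytes of `base`; nothing is interpreted or dropped.
    let mut joined = base.as_os_str().to_os_string();
    if !base_str.is_empty()
        && !path_str.is_empty()
        && !base_ends_with_sep
        && !path_starts_with_sep
    {
        joined.push(MAIN_SEPARATOR.to_string());
    }
    joined.push(path.as_os_str());
    PathBuf::from(joined)
}
```

Unlike Path::join, this sketch never discards the base: joining /srv with /etc/passwd keeps
the /srv prefix instead of returning /etc/passwd.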
Normalization
If you want to check whether two paths are identical, you need to transform them into a form
that allows comparison. Rust provides std::fs::canonicalize, which returns the true
canonical path on the filesystem:
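use std::io;
use std::path::Path;
// Both paths must exist on disk; otherwise canonicalize() returns an I/O error.
fn main() -> io::Result<()> {
    assert_eq!(
        Path::new("/srv").join("file.txt").canonicalize()?,
        Path::new("/srv").join("bar//../file.txt").canonicalize()?,
    );
    Ok(())
}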
Path::canonicalize requires a concrete path (that refers to an existing file or directory
on the filesystem) or it will return an error. NormPathExt::normalized eliminates the
intermediate components ., .., or duplicate / through pure lexical processing. It is the
building block for comparing paths, ensuring a path is restricted to some base path, or for
finding the relative path between two paths. It yields the shortest lexically equivalent path:
it is normalized.
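The kind of lexical processing described above can be sketched with the standard library
alone. This is an illustration under assumptions, not the code behind
NormPathExt::normalized: the name normalize_sketch is hypothetical, the handling of Windows
prefixes is simplified, and returning . for an empty result is a choice made for this sketch.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper (not part of npath): removes `.` and resolves `..`
/// purely lexically, without touching the filesystem.
fn normalize_sketch(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    // `components()` already collapses duplicate separators and inner `.`.
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                let last = out.components().next_back();
                let after_name = matches!(last, Some(Component::Normal(_)));
                let at_root =
                    matches!(last, Some(Component::RootDir | Component::Prefix(_)));
                if after_name {
                    // `name/..` cancels out.
                    out.pop();
                } else if !at_root {
                    // Leading `..` of a relative path cannot be simplified.
                    out.push("..");
                }
                // At the root, `..` is dropped: there is nothing above it.
            }
            other => out.push(other.as_os_str()),
        }
    }
    if out.as_os_str().is_empty() {
        out.push(".");
    }
    out
}
```

Because nothing is looked up on disk, the function works on paths that do not exist, but it
is also blind to symlinks.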
NormPathExt::resolved uses both approaches: the longest prefix whose individual components
exist is canonicalized, and the remaining path is normalized and appended to it. The purpose
is to circumvent the limitations of normalization, while still being able to apply it to
paths that do not exist.
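The combination of canonicalization and normalization can be approximated as follows. This
is only a sketch under assumptions: resolve_sketch and normalize_lexically are hypothetical
names, the input is assumed to be an absolute path, and the real NormPathExt::resolved may
differ in signature and edge cases.

```rust
use std::io;
use std::path::{Component, Path, PathBuf};

/// Lexical cleanup used for the part of the path that does not exist.
fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                if matches!(out.components().next_back(), Some(Component::Normal(_))) {
                    out.pop();
                } else if !out.has_root() {
                    out.push("..");
                }
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Hypothetical helper (not part of npath): canonicalizes the longest
/// existing ancestor, then normalizes the remainder lexically.
fn resolve_sketch(path: &Path) -> io::Result<PathBuf> {
    // `ancestors()` yields the path itself first, then ever shorter prefixes.
    for ancestor in path.ancestors() {
        if ancestor.exists() {
            let canonical = ancestor.canonicalize()?;
            let rest = path.strip_prefix(ancestor).unwrap_or(Path::new(""));
            return Ok(normalize_lexically(&canonical.join(rest)));
        }
    }
    // Nothing exists on disk: fall back to pure normalization.
    Ok(normalize_lexically(path))
}
```

Symlinks in the existing prefix are followed by canonicalize, while the non-existent tail is
only simplified lexically.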
Restricting paths
Web servers are exposed to path traversal vulnerabilities that allow an attacker to access
files outside of some base directory. Path::join with the base directory /srv and a
user-supplied path can yield a path outside of /srv:
use std::path::{Path, PathBuf};
assert_eq!(
    Path::new("/srv").join("/etc/passwd"),
    PathBuf::from("/etc/passwd")
);
Only accepting relative paths is not sufficient:
use std::path::{Path, PathBuf};
use npath::NormPathExt;
assert_eq!(
    Path::new("/srv").join("../etc/passwd").normalized(),
    PathBuf::from("/etc/passwd")
);
Stripping .. prefixes is not enough either:
use std::path::{Path, PathBuf};
use npath::NormPathExt;
assert_eq!(
    Path::new("/srv").join("foo/../../etc/passwd").normalized(),
    PathBuf::from("/etc/passwd")
);
If the user-provided path only needs to be a single path component, the programmer can forbid
any string containing path separators and filter out .. as well. Otherwise, the inner ..
components need to be simplified, and the prefix .. components eliminated. Normalization is
at the core of the following methods:
- NormPathExt::is_inside: checks if a path is a descendant of another.
- NormPathExt::rooted_join: joins two paths, the result is restricted to the first one.
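The idea of simplifying inner .. components and eliminating leading ones can be sketched
with the standard library. The function below is hypothetical and is not
NormPathExt::rooted_join itself; it assumes the base path is already normalized, and the
real method may treat absolute or escaping inputs differently (for example by reporting an
error).

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper (not part of npath): joins `user` onto `base` so that
/// the result can never climb above `base`.
fn rooted_join_sketch(base: &Path, user: &Path) -> PathBuf {
    let mut out = base.to_path_buf();
    // Number of components appended below `base` so far.
    let mut depth = 0usize;
    for component in user.components() {
        match component {
            // Ignore `.` and anything that would make the user path absolute.
            Component::Prefix(_) | Component::RootDir | Component::CurDir => {}
            // `..` may only undo components that came from `user`.
            Component::ParentDir => {
                if depth > 0 {
                    out.pop();
                    depth -= 1;
                }
            }
            Component::Normal(name) => {
                out.push(name);
                depth += 1;
            }
        }
    }
    out
}
```

With /srv as the base, both /etc/passwd and foo/../../etc/passwd end up as
/srv/etc/passwd, so the result always starts with the base directory.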
Limitations
Because lexical path processing is limited to operations that do not interact with the
system, it can change the concrete object a path points to.
Normalization
If /a/b is a symlink to /d/e, then for /a/b/../c:
- Path::canonicalize returns /d/c if it exists, an I/O error otherwise.
- NormPathExt::normalized returns /a/c.
- NormPathExt::resolved returns /d/c, regardless of whether it exists or not.
Windows
Common Windows filesystems are case-insensitive: foo.txt, FOO.TXT, and fOo.txT point to the
same file. Additionally, the mapping from lowercase to uppercase letters in the Unicode range
is stored in the filesystem itself, and depends on when that filesystem was created. This
library performs case-insensitive comparisons only for the ASCII character set (the first 128
Unicode characters).
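An ASCII-only case-insensitive comparison can be illustrated with the standard library. This
is a sketch of the general technique, not npath's internal code; the function name is
hypothetical.

```rust
use std::path::Path;

/// Hypothetical helper (not part of npath): compares two paths component by
/// component, folding case for ASCII letters only.
fn eq_ignore_ascii_case_sketch(a: &Path, b: &Path) -> bool {
    let a: Vec<_> = a.components().collect();
    let b: Vec<_> = b.components().collect();
    a.len() == b.len()
        && a.iter()
            .zip(b.iter())
            .all(|(x, y)| x.as_os_str().eq_ignore_ascii_case(y.as_os_str()))
}
```

With this approach foo.txt and FOO.TXT compare equal, while names that differ only in the
case of a non-ASCII letter (such as é and É) do not, even though a Windows filesystem may
treat them as the same file.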
TODO
- Special Windows prefixes.