The spaceland package
The spaceland package, named after the three-dimensional world in Edwin Abbott’s book Flatland: A Romance of Many Dimensions, contains everything required to read ESRI shapefiles. It’s broken down into several core modules:
The spaceland.shp module
Read non-topological geometric records from the ESRI Shapefile format.
ESRI documented the Shapefile format in 1998 in a document titled ESRI Shapefile Technical Description.
class spaceland.shp.Shapefile(shp: typing.IO[bytes]) → None
Read records from an ESRI shapefile.
A shapefile is a binary format created by ESRI in the early 1990s for storing non-topological geometries. After a short header containing file metadata, the geometries are stored in a sequence of individual records. The format is compact and fast to read, but because it can’t contain indexes, details of the projection used, or metadata on individual shapes, it’s commonly accompanied by other files (e.g. a dBase III database for geometry metadata).
Instances can be iterated over and used as context managers.
get_parse_function()
Return a function capable of parsing a particular type of shape.
The function returned will be suitable for parsing shapefile records of one type (e.g. two-dimensional points). The type is defined in the header of the shapefile, and so the returned function will handle all non-null records within a single shapefile.
Return type: Callable[[bytes], tuple]
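One way such a lookup could work is a table keyed by the shape type code read from the file header. The sketch below is an illustration under stated assumptions, not spaceland's actual implementation: `PARSERS`, `pick_parser`, and the two stand-in parsers are hypothetical names. The codes 0 (null shape) and 1 (point) are the ones defined in the ESRI specification.

```python
import struct
from typing import Callable, Dict


def parse_null(content: bytes) -> tuple:
    # Hypothetical stand-in: a null shape carries no geometry.
    return ()


def parse_point(content: bytes) -> tuple:
    # Hypothetical stand-in: two little-endian doubles, x then y.
    return struct.unpack("<2d", content)


# Shape type codes from the ESRI specification, mapped to parsers.
PARSERS: Dict[int, Callable[[bytes], tuple]] = {
    0: parse_null,
    1: parse_point,
}


def pick_parser(shape_type: int) -> Callable[[bytes], tuple]:
    """Return the parser for a shape type code, or raise if unsupported."""
    try:
        return PARSERS[shape_type]
    except KeyError:
        raise ValueError(f"unsupported shape type: {shape_type}") from None
```

Because the shape type is fixed per file, the lookup only needs to happen once, and the returned function can then be applied to every non-null record.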
records()
Yield all geometric records in the shapefile, one-by-one.
Records are returned in file order. Each record is a tuple whose structure depends on the shape type. The structure of each shape type’s tuple is detailed in the shape parsing functions:
parse_null_record() for null shapes
parse_point_record() for two-dimensional points
The appropriate parsing function for a file can be found using Shapefile.get_parse_function().
Return type: Iterable[tuple]
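To make "file order" concrete, here is a minimal sketch of how records can be walked in a raw .shp stream. It follows the record layout in the ESRI specification (an 8-byte big-endian record header, then content that starts with a little-endian shape type). The function name `iter_raw_records` is hypothetical, and the assumption that the body handed to a parser excludes the 4-byte shape type is inferred from the parameter descriptions of the parsing functions rather than confirmed by the library.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple


def iter_raw_records(shp: BinaryIO) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (record_number, shape_type, body) for each record, in file order.

    Assumes the stream is positioned just past the 100-byte file header.
    """
    while True:
        header = shp.read(8)
        if len(header) < 8:
            return  # end of file, or a truncated trailing header
        # Record headers are big-endian: the record number, then the
        # content length measured in 16-bit words.
        number, length_words = struct.unpack(">2i", header)
        content = shp.read(length_words * 2)
        # The content begins with a little-endian shape type code.
        (shape_type,) = struct.unpack("<i", content[:4])
        yield number, shape_type, content[4:]


# Build a tiny in-memory stream: one point record, then one null record.
point_body = struct.pack("<2d", 3.0, 4.0)
data = (
    struct.pack(">2i", 1, 10) + struct.pack("<i", 1) + point_body
    + struct.pack(">2i", 2, 2) + struct.pack("<i", 0)
)
records = list(iter_raw_records(io.BytesIO(data)))
```

The null record shows why a reader has to check each record's own shape type: its body is empty even in a file whose header declares points.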
class spaceland.shp.ShapefileMeta(shape_type, x_min, y_min, x_max, y_max, z_min, z_max, m_min, m_max)
m_max
Alias for field number 8
m_min
Alias for field number 7
shape_type
Alias for field number 0
x_max
Alias for field number 3
x_min
Alias for field number 1
y_max
Alias for field number 4
y_min
Alias for field number 2
z_max
Alias for field number 6
z_min
Alias for field number 5
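The field order above matches the layout of the shapefile header, where the shape type and the eight bounding values sit in bytes 32 to 99. The sketch below shows how such a tuple could be filled from a raw header. `Meta` and `read_meta` are hypothetical stand-ins, not the library's own names, and the byte offsets are taken from the ESRI specification.

```python
import struct
from collections import namedtuple

# Hypothetical stand-in mirroring the documented field order.
Meta = namedtuple(
    "Meta",
    "shape_type x_min y_min x_max y_max z_min z_max m_min m_max",
)


def read_meta(header: bytes) -> Meta:
    """Extract the shape type and bounding box from a 100-byte header."""
    if len(header) != 100:
        raise ValueError("a shapefile header is exactly 100 bytes")
    # Bytes 0-3: the file code, a big-endian integer that is always 9994.
    (file_code,) = struct.unpack(">i", header[:4])
    if file_code != 9994:
        raise ValueError("not a shapefile")
    # Bytes 32-99: a little-endian shape type followed by eight doubles.
    return Meta(*struct.unpack("<i8d", header[32:100]))


# A synthetic header: file code, five unused ints, file length (in words),
# version 1000, shape type 1 (point), then the bounding values.
header = (
    struct.pack(">7i", 9994, 0, 0, 0, 0, 0, 50)
    + struct.pack("<2i", 1000, 1)
    + struct.pack("<8d", 0.0, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0, 0.0)
)
meta = read_meta(header)
```

Note the mixed byte order: the file code and length are big-endian, while everything from the version onward is little-endian.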
spaceland.shp.parse_null_record(content)
Parse a null shape record from a shapefile.
A null shape is an empty record with no geometric data. It can be used as the shape type for a shapefile, but it’s also valid as a placeholder in a shapefile of any other type. That is, a shapefile of polygons can also include null shape records. This is the only valid way a shapefile can contain multiple shape types.
Parameters: content (bytes) – An empty byte string.
Return type: tuple
Returns: An empty tuple.
spaceland.shp.parse_point_record(content)
Parse a point shape record from a shapefile.
A point consists of a pair of double-precision coordinates ordered x, y.
Parameters: content (bytes) – 16 bytes containing two 64-bit IEEE double-precision floating-point numbers, in little-endian byte order.
Return type: tuple
Returns: A tuple containing a point in x, y order.
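The byte layout described above maps directly onto Python's `struct` module. The following is a minimal sketch of the decoding step, assuming nothing beyond the 16-byte layout given in the parameter description. `decode_point` is a hypothetical name, and the length check is an addition for illustration, not documented behaviour of `parse_point_record()`.

```python
import struct


def decode_point(content: bytes) -> tuple:
    """Decode 16 bytes into an (x, y) tuple of floats."""
    if len(content) != 16:
        raise ValueError(f"expected 16 bytes, got {len(content)}")
    # "<2d" means little-endian, two 8-byte IEEE 754 doubles.
    return struct.unpack("<2d", content)


# Round-trip a sample coordinate pair through the binary form.
raw = struct.pack("<2d", -0.1275, 51.5072)
x, y = decode_point(raw)
```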
The spaceland.dbf module
Reads the subset of the dBase III file format used by ESRI shapefiles.
The dBase III format was never specified publicly but it has been reverse-engineered. The best documentation on the subject can be found at http://www.clicketyclick.dk/databases/xbase/format/dbf.html.
class spaceland.dbf.DbaseFile(dbf: typing.IO[bytes], encoding: str = 'ascii') → None
Read fields and records from a dBase III binary file.
A dBase III file is a simple tabular data format consisting of a header, fields (columns), and records (rows). Fields are typed; as used in the ESRI shapefile format, each field in a dBase III file must have one of five types: string, float, integer, date, or boolean. All types allow null values.
Class objects allow for iteration and slicing, and they also work as context managers.
record(index)
Return the record at the given index.
Parameters: index (int) – The position of the record relative to the beginning of the file.
Return type: tuple
Returns: A namedtuple, each item matching one field in the record.
records(start=0)
Yield the records in the file.
A record is a set of fields and their values. The field names, types, and order are consistent across all records in the file.
It’s possible that a field has an invalid value (e.g. a non-numeric value in an integer field). When this happens the value becomes None and no error is raised.
Parameters: start (int) – The record from which to start iteration. By default starts with the first record in the file.
Yields: A namedtuple, each item matching one field in the record. Item names and order are consistent across records within the same file, but will differ between files.
Return type: Iterable[tuple]
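The "invalid value becomes None" behaviour can be pictured as a conversion that swallows the parsing error. This is only a sketch of the idea for an integer field. `parse_int_or_none` is a hypothetical helper, and the assumption that field values arrive as space-padded ASCII bytes is not stated in this documentation.

```python
from typing import Optional


def parse_int_or_none(value: bytes) -> Optional[int]:
    """Convert a padded field to an int, or None if it isn't numeric."""
    try:
        # int() accepts bytes containing ASCII digits.
        return int(value.strip())
    except ValueError:
        # Blank or non-numeric content: report a null rather than fail.
        return None
```

A caller that needs to tell a genuinely empty field from a corrupt one would have to inspect the raw bytes, since both end up as None here.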
spaceland.dbf.get_parse_str(encoding)
Return a function that decodes bytes to strings.
The returned function decodes the bytes using the character encoding passed to this function.
>>> utf8 = get_parse_str("UTF-8")
>>> utf8(b'\xf0\x9f\x91\x8d')
'👍'
Parameters: encoding (str) – The name of a character encoding that can be used to decode the bytes to a string.
Return type: Callable[[bytes], str]
Returns: A function that uses the given character encoding to convert bytes to strings.
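A function that returns a decoder like this is typically written as a closure. The sketch below shows one way to do it. `make_decoder` is a hypothetical name, and the early `codecs.lookup()` check is an illustrative design choice, not something the documentation says `get_parse_str()` does.

```python
import codecs
from typing import Callable


def make_decoder(encoding: str) -> Callable[[bytes], str]:
    """Return a function that decodes bytes using the given encoding."""
    # Fail early with LookupError if the encoding name is unknown,
    # instead of waiting for the first value to be decoded.
    codecs.lookup(encoding)

    def decode(value: bytes) -> str:
        return value.decode(encoding)

    return decode


latin1 = make_decoder("latin-1")
text = latin1(b"caf\xe9")
```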
spaceland.dbf.parse_bool(value)
Convert bytes to a boolean value.
Parameters: value (bytes) – A bytes value to be converted to a boolean value.
Return type: Optional[bool]
Returns: True if the bytes value is Y, y, T, or t; False if the bytes value is N, n, F, or f; None otherwise.
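The three-way mapping described above can be sketched with two lookup sets. This is an illustration of the documented behaviour, not the library's code. `to_bool` and the two constants are hypothetical names.

```python
from typing import Optional

TRUE_VALUES = {b"Y", b"y", b"T", b"t"}
FALSE_VALUES = {b"N", b"n", b"F", b"f"}


def to_bool(value: bytes) -> Optional[bool]:
    """Map a one-byte dBase logical value to True, False, or None."""
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    # Anything else is treated as a null value.
    return None
```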
spaceland.dbf.parse_date(value)
Convert bytes in the format YYYYMMDD to a datetime.date object.
Parameters: value (bytes) – A bytes value to be converted to a date.
Return type: Optional[date]
Returns: A datetime.date object if the bytes value is a valid date, but None otherwise.
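As a rough sketch of the same contract, the standard library's `strptime` can both parse the YYYYMMDD layout and reject impossible dates. `to_date` is a hypothetical name, and decoding the bytes as ASCII is an assumption made for this example.

```python
import datetime
from typing import Optional


def to_date(value: bytes) -> Optional[datetime.date]:
    """Convert YYYYMMDD bytes to a date, or None if they aren't a valid date."""
    try:
        text = value.decode("ascii")
        return datetime.datetime.strptime(text, "%Y%m%d").date()
    except ValueError:
        # Covers malformed text, impossible dates, and non-ASCII bytes
        # (UnicodeDecodeError is a subclass of ValueError).
        return None
```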
The spaceland.cli module
Command-line interface to the library’s functionality.
This module provides the following functions that are registered as ‘console script’ entry points in setup.py:
dbf_to_csv(): convert dBase III files to CSVs (as command dbfr)
When the package is installed via setuptools (e.g. using pip install) the commands are immediately available to the user.
spaceland.cli.dbf_to_csv()
Read a dBase III file and convert it to a CSV.
Used as a ‘console script’ entry point in setup.py and available on the command-line as dbfr. The dBase III file named as an argument is parsed and converted to CSV, and output to stdout. The CSV dialect used can be configured using command-line options, as can the character-encoding used when reading the dBase file.
Return type: None
spaceland.cli.extant_file(arg)
Type-check an argument to ensure it names an existing file.
Return type: Path
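A function with this contract fits the `type=` hook of `argparse`, which calls it on the raw argument string and reports an `ArgumentTypeError` as a usage error. The following is a minimal sketch of that pattern. `existing_file` and the parser setup are hypothetical, and the documentation does not confirm that spaceland uses `argparse`.

```python
import argparse
from pathlib import Path


def existing_file(arg: str) -> Path:
    """Return `arg` as a Path, or raise if it doesn't name an existing file."""
    path = Path(arg)
    if not path.is_file():
        raise argparse.ArgumentTypeError(f"{arg} is not an existing file")
    return path


# Hypothetical parser showing where the type-check plugs in.
parser = argparse.ArgumentParser(prog="example")
parser.add_argument("file", type=existing_file)
```

Validating at parse time means the rest of the program can assume it has been handed a real file.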