
PDAL Python support allows you to process data with PDAL into `Numpy`_
arrays. It provides a PDAL extension module to control Python interaction
with PDAL. Additionally, you can use it to fetch `schema`_ and `metadata`_
from PDAL operations.

.. note:: The PDAL Python bindings require the PDAL base library to be
   installed. Source code can be found at https://pdal.io and on `GitHub
   <https://github.com/PDAL/PDAL>`__.
PyPI
................................................................................

PDAL Python support is installable via PyPI:

.. code-block::

    pip install PDAL
GitHub
................................................................................

The repository for PDAL's Python extension is available at
https://github.com/PDAL/python

Python support has been released independently from PDAL itself as of PDAL 1.7.
Simple
................................................................................

Given the following pipeline, which simply reads an `ASPRS LAS`_ file and
sorts it by the ``X`` dimension:

.. _`ASPRS LAS`: https://www.asprs.org/committee-general/laser-las-file-format-exchange-activities.html
.. code-block:: python

    json = """
    {
      "pipeline": [
        "1.2-with-color.las",
        {
            "type": "filters.sort",
            "dimension": "X"
        }
      ]
    }"""

    import pdal
    pipeline = pdal.Pipeline(json)
    count = pipeline.execute()
    arrays = pipeline.arrays
    metadata = pipeline.metadata
    log = pipeline.log
Programmatic Pipeline Construction
................................................................................

The previous example specified the pipeline as a JSON string. Alternatively, a
pipeline can be constructed by creating ``Stage`` instances and piping them
together. For example, the previous pipeline can be specified as:
.. code-block:: python

    pipeline = pdal.Reader("1.2-with-color.las") | pdal.Filter.sort(dimension="X")
A stage is an instance of ``pdal.Reader``, ``pdal.Filter`` or ``pdal.Writer``.
A stage can be instantiated by passing as keyword arguments the options
applicable to the respective PDAL stage. For more on PDAL stages and their
options, check the PDAL documentation on `Stage Objects
<https://pdal.io/pipeline.html#stage-objects>`__.
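
For instance, a minimal sketch of instantiating the sort filter from the
earlier example, passing its ``dimension`` option as a keyword argument:

.. code-block:: python

    import pdal

    # Equivalent to the {"type": "filters.sort", "dimension": "X"} JSON stage
    sort_stage = pdal.Filter(type="filters.sort", dimension="X")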
- The ``filename`` option of ``Readers`` and ``Writers``, as well as the
  ``type`` option of ``Filters``, can be passed positionally as the first
  argument.
- The ``inputs`` option specifies a sequence of stages to be set as input to
  the current stage. Each input can be either the string tag of another stage
  or the ``Stage`` instance itself.

The ``Reader``, ``Filter`` and ``Writer`` classes come with static methods for
all the respective PDAL drivers. For example, ``pdal.Filter.head()`` is a
shortcut for ``pdal.Filter(type="filters.head")``. These methods are
auto-generated by introspecting ``pdal`` and the available options are
included in each method's docstring:
.. code-block::

    >>> help(pdal.Filter.head)
    Help on function head in module pdal.pipeline:

    head(**kwargs)
        Return N points from beginning of the point cloud.

        user_data: User JSON
        log: Debug output filename
        option_file: File from which to read additional options
        where: Expression describing points to be passed to this filter
        where_merge='auto': If 'where' option is set, describes how skipped points should be merged with kept points in standard mode.
        count='10': Number of points to return from beginning. If 'invert' is true, number of points to drop from the beginning.
        invert='false': If true, 'count' specifies the number of points to skip from the beginning.
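
Putting these pieces together, here is a brief sketch of positional options
and the ``inputs`` option (the ``tag`` name and point count are illustrative):

.. code-block:: python

    import pdal

    # "filename" is passed positionally to the Reader; "tag" names the stage
    reader = pdal.Reader("1.2-with-color.las", tag="input")

    # "type" is passed positionally to the Filter; "inputs" may reference
    # another stage by its string tag or by the Stage instance itself
    head = pdal.Filter("filters.head", count=5, inputs=["input"])

    pipeline = pdal.Pipeline([reader, head])
    print(pipeline.execute())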
A ``pdal.Pipeline`` instance can be created from:

- a JSON string: ``Pipeline(json_string)``
- a sequence of ``Stage`` instances: ``Pipeline([stage1, stage2])``
- a single ``Stage`` with the ``Stage.pipeline`` method: ``stage.pipeline()``
- nothing: ``Pipeline()`` creates a pipeline with no stages.
- joining ``Stage`` and/or other ``Pipeline`` instances together with the pipe
  operator (``|``):

  - ``stage1 | stage2``
  - ``stage1 | pipeline1``
  - ``pipeline1 | stage1``
  - ``pipeline1 | pipeline2``
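
For instance, a short sketch of a few of these construction styles, reusing
the stages from the sorting example above:

.. code-block:: python

    import pdal

    reader = pdal.Reader("1.2-with-color.las")
    sorter = pdal.Filter.sort(dimension="X")

    p1 = pdal.Pipeline([reader, sorter])  # from a sequence of stages
    p2 = reader.pipeline()                # from a single stage
    p3 = reader | sorter                  # from the pipe operator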
Every application of the pipe operator creates a new ``Pipeline`` instance. To
update an existing ``Pipeline`` use the respective in-place pipe operator
(``|=``):
.. code-block:: python

    # update pipeline in-place
    pipeline = pdal.Pipeline()
    pipeline |= stage
    pipeline |= pipeline2
Reading using Numpy Arrays
................................................................................

The following more complex scenario demonstrates a full round trip between
PDAL and Python: the `TileDB-PDAL integration`_ using the `TileDB writer
plugin`_:

.. code-block:: python
    import pdal

    data = "https://github.com/PDAL/PDAL/blob/master/test/data/las/1.2-with-color.las?raw=true"

    pipeline = pdal.Reader.las(filename=data).pipeline()
    print(pipeline.execute())  # 1065 points

    # Get the data from the first array
    # [array([(637012.24, 849028.31, 431.66, 143, 1,
    # 1, 1, 0, 1, -9., 132, 7326, 245380.78254963, 68, 77, 88),
    # dtype=[('X', '<f8'), ('Y', '<f8'), ('Z', '<f8'), ('Intensity', '<u2'),
    # ('ReturnNumber', 'u1'), ('NumberOfReturns', 'u1'), ('ScanDirectionFlag', 'u1'),
    # ('EdgeOfFlightLine', 'u1'), ('Classification', 'u1'), ('ScanAngleRank', '<f4'),
    # ('UserData', 'u1'), ('PointSourceId', '<u2'),
    # ('GpsTime', '<f8'), ('Red', '<u2'), ('Green', '<u2'), ('Blue', '<u2')])]
    arr = pipeline.arrays[0]

    # Keep only the points with intensity greater than 30
    intensity = arr[arr["Intensity"] > 30]
    print(len(intensity))  # 704 points

    # Now use pdal to clamp points that have intensity 100 <= v < 300
    pipeline = pdal.Filter.range(limits="Intensity[100:300)").pipeline(intensity)
    print(pipeline.execute())  # 387 points
    clamped = pipeline.arrays[0]

    # Write our intensity data to a LAS file and a TileDB array. For TileDB it is
    # recommended to use Hilbert ordering by default with geospatial point cloud data,
    # which requires specifying a domain extent. This can be determined automatically
    # from a stats filter that computes statistics about each dimension (min, max, etc.).
    pipeline = pdal.Writer.las(
        filename="clamped.las",
        offset_x="auto",
        offset_y="auto",
        offset_z="auto",
        scale_x=0.01,
        scale_y=0.01,
        scale_z=0.01,
    ).pipeline(clamped)
    pipeline |= pdal.Filter.stats() | pdal.Writer.tiledb(array_name="clamped")
    print(pipeline.execute())  # 387 points

    # Dump the TileDB array schema
    import tiledb
    with tiledb.open("clamped") as a:
        print(a.schema)
Executing Streamable Pipelines
................................................................................
Streamable pipelines (pipelines that consist exclusively of streamable PDAL
stages) can be executed in streaming mode via ``Pipeline.iterator()``. This
returns an iterator object that yields Numpy arrays of up to ``chunk_size``
points (default: 10000) at a time.
.. code-block:: python

    import pdal

    pipeline = pdal.Reader("test/data/autzen-utm.las") | pdal.Filter.range(limits="Intensity[80:120)")
    for array in pipeline.iterator(chunk_size=500):
        print(len(array))

    # or to concatenate all arrays into one
    # full_array = np.concatenate(list(pipeline))
``Pipeline.iterator()`` also takes an optional ``prefetch`` parameter
(default: 0) to allow prefetching up to this number of arrays in parallel and
buffering them until they are yielded to the caller.
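
For example, a sketch reusing the pipeline above, with up to two chunks
buffered ahead of the consumer:

.. code-block:: python

    # prefetch up to 2 chunks in the background while the caller
    # is still processing the current one
    for array in pipeline.iterator(chunk_size=500, prefetch=2):
        print(len(array))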
If you just want to execute a streamable pipeline in streaming mode and don't
need to access the data points (typically when the pipeline has Writer
stage(s)), you can use the ``Pipeline.execute_streaming(chunk_size)`` method
instead. This is functionally equivalent to
``sum(map(len, pipeline.iterator(chunk_size)))`` but more efficient as it
avoids allocating and filling any arrays in memory.
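
As a rough sketch (the input and output file names are illustrative), a
write-only pipeline can be executed in streaming mode like this:

.. code-block:: python

    import pdal

    # No arrays are materialized in Python; points stream from reader to writer.
    pipeline = pdal.Reader("test/data/autzen-utm.las") | pdal.Writer.las(filename="out.las")
    print(pipeline.execute_streaming(chunk_size=500))  # total number of points processed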
Accessing Mesh Data
................................................................................

Some PDAL stages (for instance ``filters.delaunay``) create TIN type mesh
data. This data can be accessed in Python using the ``Pipeline.meshes``
property, which returns a ``numpy.ndarray`` of shape (1, n) where n is the
number of Triangles in the mesh. If the PointView contains no mesh data, then
n = 0.

Each Triangle is a tuple ``(A, B, C)`` where A, B and C are indices into the
PointView identifying the point that is the vertex for the Triangle.
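
As an illustrative sketch (the input file is assumed to be available),
triangle data can be pulled out of a pipeline as follows:

.. code-block:: python

    import pdal

    pipeline = pdal.Reader("test/data/autzen-utm.las") | pdal.Filter.delaunay()
    pipeline.execute()

    triangles = pipeline.meshes[0]  # mesh of the first PointView, shape (1, n)
    print(triangles.shape)
    print(triangles[0][0])          # first (A, B, C) vertex-index tuple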
Meshio Integration
................................................................................

The ``meshes`` property provides the face data but is not easy to use as a
mesh. Therefore, we have provided optional integration into the `Meshio
<https://github.com/nschloe/meshio>`__ library.
The ``pdal.Pipeline`` class provides the ``get_meshio(idx: int) -> meshio.Mesh``
method. This method creates a ``Mesh`` object from the ``PointView`` array and
mesh properties.

.. note:: The meshio integration requires that meshio is installed (e.g.
   ``pip install meshio``). If it is not, the method fails with an informative
   RuntimeError.
Simple use of the functionality could be as follows:

.. code-block:: python

    import pdal
    ...
    pl = pdal.Pipeline(pipeline)
    pl.execute()

    mesh = pl.get_meshio(0)
    mesh.write('test.obj')
Advanced Mesh Use Case
................................................................................

USE-CASE: Take a LiDAR map, create a mesh from the ground points, split into
tiles and store the tiles in PostGIS.

.. note:: Like ``Pipeline.arrays``, ``Pipeline.meshes`` returns a list of
   ``numpy.ndarray`` to provide for the case where the output from a Pipeline
   is multiple PointViews.

(example using 1.2-with-color.las and not doing the ground classification for
clarity)
.. code-block:: python
import pdal
import psycopg2
import io
pl = (
pdal.Reader(".../python/test/data/1.2-with-color.las")
| pdal.Filter.splitter(length=1000)
| pdal.Filter.delaunay()
)
pl.execute()
conn = psycopg(%CONNNECTION_STRING%)
buffer = io.StringIO
for idx in range(len(pl.meshes)):
m = pl.get_meshio(idx)
if m:
m.write(buffer, file_format = "wkt")
with conn.cursor() as curr:
curr.execute(
"INSERT INTO %table-name% (mesh) VALUES (ST_GeomFromEWKT(%(ewkt)s)",
{ "ewkt": buffer.getvalue()}
)
conn.commit()
conn.close()
buffer.close()
.. _Numpy: http://www.numpy.org/
.. _schema: http://www.pdal.io/dimensions.html
.. _metadata: http://www.pdal.io/development/metadata.html
.. _TileDB: https://tiledb.com/
.. _`TileDB-PDAL integration`: https://docs.tiledb.com/geospatial/pdal
.. _`TileDB writer plugin`: https://pdal.io/stages/writers.tiledb.html