
pandas-maxminddb
Provides fast and convenient geolocation bindings for Pandas DataFrames. Uses numpy ndarrays internally, which is faster than naively applying a function per column. Based on maxminddb-rust.
pip install pandas_maxminddb
The wheels are built against the following numpy and pandas distributions:
--extra-index-url=https://www.piwheels.org/simple, install libatlas-base-dev for numpy.
--extra-index-url https://alpine-wheels.github.io/index, install libstdc++ for pandas.
Refer to the build workflow for details.
| Py | win x86 | win x64 | macOS x86_64 | macOS AArch64 | linux x86_64 | linux i686 | linux AArch64 | linux ARMv7 | musl linux x86_64 |
|---|---|---|---|---|---|---|---|---|---|
| 3.8 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚫 | ✅ |
| 3.9 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 🚫 |
| 3.10 | 🚫 | ✅ | ✅ | ✅ | ✅ | ✅ | 🚫 | 🚫 | ✅ |
Importing pandas_maxminddb registers a geo extension on Pandas DataFrames that lets you add geolocation columns in place. This example uses a context manager to control the reader's lifetime:
import pandas as pd
from pandas_maxminddb import open_database
ips = pd.DataFrame(data={
    'ip': ["75.63.106.74", "132.206.246.203", "94.226.237.31", "128.119.189.49", "2.30.253.245"]})
with open_database('./GeoLite.mmdb/GeoLite2-City.mmdb') as reader:
    ips.geo.geolocate('ip', reader, ['country', 'city', 'state', 'postcode'])
ips
| | ip | city | postcode | state | country |
|---|---|---|---|---|---|
| 0 | 75.63.106.74 | Houston | 77070 | TX | US |
| 1 | 132.206.246.203 | Montreal | H3A | QC | CA |
| 2 | 94.226.237.31 | Kapellen | 2950 | VLG | BE |
| 3 | 128.119.189.49 | Northampton | 01060 | MA | US |
| 4 | 2.30.253.245 | London | SW15 | ENG | GB |
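The flat columns in the table above come from a nested lookup record. The following is a minimal sketch of that flattening in plain Python, not the library's implementation: `COLUMN_PATHS`, `dig`, and `flatten` are hypothetical names, the paths are assumptions based on the usual GeoLite2-City record layout, and the sample record is hand-written to match the first row.

```python
from typing import Any, Optional

# Hypothetical mapping from flat column names to paths inside a nested
# GeoLite2-City style record; the library's real mapping may differ.
COLUMN_PATHS = {
    "country": ("country", "iso_code"),
    "city": ("city", "names", "en"),
    "state": ("subdivisions", 0, "iso_code"),
    "postcode": ("postal", "code"),
}


def dig(record: Any, path: tuple) -> Optional[Any]:
    """Follow dict keys and list indexes in `path`; return None if a step is missing."""
    current = record
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return None
    return current


def flatten(record: dict, columns: list) -> dict:
    """Build one flat row containing only the requested columns."""
    return {name: dig(record, COLUMN_PATHS[name]) for name in columns}


# Hand-written sample shaped like a GeoLite2-City lookup result.
sample = {
    "country": {"iso_code": "US", "names": {"en": "United States"}},
    "city": {"names": {"en": "Houston"}},
    "subdivisions": [{"iso_code": "TX", "names": {"en": "Texas"}}],
    "postal": {"code": "77070"},
}

row = flatten(sample, ["country", "city", "state", "postcode"])
print(row)
```

Missing fields come back as `None` here, so a record without a postal code still produces a complete row.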
You can also instantiate the reader yourself, e.g.:
import pandas as pd
from pandas_maxminddb import ReaderMem, ReaderMmap
reader = ReaderMem('./GeoLite.mmdb/GeoLite2-City.mmdb')
ips = pd.DataFrame(data={
    'ip': ["75.63.106.74", "132.206.246.203", "94.226.237.31", "128.119.189.49", "2.30.253.245"]})
ips.geo.geolocate('ip', reader, ['country', 'city', 'state', 'postcode'])
ips
If the dataset is big enough and you have spare cores, you might benefit from running lookups in parallel. Currently only ReaderMem supports this:
import pandas as pd
from pandas_maxminddb import ReaderMem
reader = ReaderMem('./GeoLite.mmdb/GeoLite2-City.mmdb')
ips = pd.DataFrame(data={
    'ip': ["75.63.106.74", "132.206.246.203", "94.226.237.31", "128.119.189.49", "2.30.253.245"]})
ips.geo.geolocate('ip', reader, ['country', 'city', 'state', 'postcode'], parallel=True)
ips
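Whether `parallel=True` pays off depends on the dataset size and the machine, so it may be worth measuring both modes on your own data. Below is a small timing sketch using only the standard library; `best_of` is a hypothetical helper, and the stand-in workload is there only so the snippet runs without a database file.

```python
import time
from typing import Callable


def best_of(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the fastest wall-clock time in seconds over `repeats` calls of `fn`."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


# Stand-in workload so the sketch runs without a database file.
baseline = best_of(lambda: sum(range(100_000)))
print(f"fastest run: {baseline * 1000:.2f} ms")

# With a real reader and DataFrame (names as in the example above) one might
# compare the two modes on copies, so the original frame is not modified:
# cols = ['country', 'city', 'state', 'postcode']
# serial = best_of(lambda: ips.copy().geo.geolocate('ip', reader, cols))
# parallel = best_of(lambda: ips.copy().geo.geolocate('ip', reader, cols, parallel=True))
```

Taking the minimum over a few runs reduces noise from other processes; the copy cost is included in both measurements, so the comparison stays fair.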
| Name (time in ms) | Min | Max | Mean | StdDev | Median | IQR | Outliers | OPS | Rounds | Iterations |
|---|---|---|---|---|---|---|---|---|---|---|
| test_benchmark_pandas_parallel_mem_maxminddb | 52.7588 (1.0) | 57.4206 (1.0) | 54.0573 (1.0) | 1.1782 (1.15) | 53.8497 (1.0) | 1.4194 (1.09) | 4;1 | 18.4989 (1.0) | 20 | 1 |
| test_benchmark_pandas_mmap_maxminddb | 240.0050 (4.55) | 244.3257 (4.26) | 242.2177 (4.48) | 1.9017 (1.85) | 243.1021 (4.51) | 3.2122 (2.46) | 2;0 | 4.1285 (0.22) | 5 | 1 |
| test_benchmark_pandas_mem_maxminddb | 241.4630 (4.58) | 244.2553 (4.25) | 242.8391 (4.49) | 1.0288 (1.0) | 242.7672 (4.51) | 1.3064 (1.0) | 2;0 | 4.1180 (0.22) | 5 | 1 |
| test_benchmark_c_maxminddb | 1,010.6569 (19.16) | 1,055.1080 (18.38) | 1,021.3691 (18.89) | 18.9273 (18.40) | 1,013.3819 (18.82) | 12.9544 (9.92) | 1;1 | 0.9791 (0.05) | 5 | 1 |
| test_benchmark_python_maxminddb | 9,021.2686 (170.99) | 9,188.7629 (160.03) | 9,071.0055 (167.80) | 70.0512 (68.09) | 9,039.7811 (167.87) | 84.7766 (64.89) | 1;0 | 0.1102 (0.01) | 5 | 1 |
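The figures in parentheses appear to be ratios relative to the best value in each column. As a quick check, this sketch recomputes the ratios for the Mean column from the values in the table; the shortened dictionary keys are only labels chosen for the example.

```python
# Mean times in ms, copied from the benchmark table above.
mean_ms = {
    "pandas_parallel_mem": 54.0573,
    "pandas_mmap": 242.2177,
    "pandas_mem": 242.8391,
    "c_maxminddb": 1021.3691,
    "python_maxminddb": 9071.0055,
}

# Divide every mean by the fastest one to get the relative slowdown.
fastest = min(mean_ms.values())
relative = {name: round(ms / fastest, 2) for name, ms in mean_ms.items()}

for name, ratio in relative.items():
    print(f"{name}: {ratio:.2f}x the fastest mean")
```

By this measure the parallel in-memory reader is roughly 4.5 times faster than the single-threaded readers on this benchmark.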
Because DataFrame columns are flat arrays while geolocation data comes in a hierarchical format, you might need to provide more mappings to serve your particular use case. To do that, follow the Development section to set up your environment and then:
git clone --recurse-submodules git@github.com:andrusha/pandas-maxminddb.git
PYTHON_CONFIGURE_OPTS="--enable-shared" asdf install
PYTHON_CONFIGURE_OPTS="--enable-shared" python -m venv .venv
source .venv/bin/activate
pip install nox
nox -s test
PYTHONPATH=.venv/lib/python3.8/site-packages cargo test --no-default-features
To run nox -s bench properly, you need libmaxminddb installed as per the maxminddb instructions before installing the Python package, so that the C extension can be benchmarked properly.
On macOS this requires the following:
brew install libmaxminddb
PATH="/opt/homebrew/Cellar/libmaxminddb/1.7.1/bin:$PATH" LDFLAGS="-L/opt/homebrew/Cellar/libmaxminddb/1.7.1/lib" CPPFLAGS="-I/opt/homebrew/Cellar/libmaxminddb/1.7.1/include" pip install maxminddb --force-reinstall --verbose --no-cache-dir
FAQs
Fast geolocation library for Pandas Dataframes, built on Numpy C-FFI
The pypi package pandas-maxminddb receives a total of 69 weekly downloads. As such, pandas-maxminddb popularity was classified as not popular.
We found that pandas-maxminddb demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.