@apache-arrow/es5-cjs

@apache-arrow/es5-cjs - npm Package Versions

0.16.0

Apache Arrow 0.16.0 (2020-02-07)

Bug Fixes

  • ARROW-3783 - [R] Incorrect collection of float type
  • ARROW-3962 - [Go] Handle null values in CSV
  • ARROW-4470 - [Python] Pyarrow using considerably more memory when reading partitioned Parquet file
  • ARROW-4998 - [R] R package fails to install on OSX
  • ARROW-5575 - [C++] Split Targets.cmake for each module
  • ARROW-5655 - [Python] Table.from_pydict/from_arrays not using types in specified schema correctly
  • ARROW-5680 - [Rust][DataFusion] GROUP BY sql tests are now deterministic
  • ARROW-6157 - [C++] Array data validation
  • ARROW-6195 - [C++] Detect Apache mirror without Python
  • ARROW-6298 - [Rust] [CI] Examples are not being tested in CI
  • ARROW-6320 - [C++] Arrow utilities are linked statically
  • ARROW-6429 - [CI][Crossbow] Nightly spark integration job fails
  • ARROW-6445 - [CI][Crossbow] Nightly Gandiva jar trusty job fails
  • ARROW-6567 - [Rust][DataFusion] Wrap aggregate in projection when needed
  • ARROW-6581 - [C++] Fix fuzzit job submission
  • ARROW-6704 - [C++] Check for out of bounds timestamp in unsafe cast
  • ARROW-6708 - [C++] Fix hardcoded boost library names
  • ARROW-6728 - [C#] Support reading and writing Date32 and Date64 arrays
  • ARROW-6736 - [Rust][DataFusion] Evaluate the input to the aggregate expression just once per batch
  • ARROW-6740 - [C++] Unmap MemoryMappedFile as soon as possible
  • ARROW-6745 - [Rust] Fix a variety of minor typos.
  • ARROW-6749 - [Python] Let Array.to_numpy use general conversion code with zero_copy_only=True
  • ARROW-6750 - [Python] Silence S3 error logs by default
  • ARROW-6761 - [Rust] Travis build now uses the correct Rust toolchain
  • ARROW-6762 - [C++] Support reading JSON files with no newline at end
  • ARROW-6785 - [JS] Remove superfluous child assignment
  • ARROW-6786 - [C++] arrow-dataset-file-parquet-test is slow
  • ARROW-6795 - [C#] Fix for reading large (2GB+) files
  • ARROW-6798 - [CI] [Rust] Improve build times by caching dependencies in the Docker image
  • ARROW-6801 - [Rust] Arrow source release tarball is missing benchmarks
  • ARROW-6806 - [C++][Python] Fix crash validating an IPC-originating empty array
  • ARROW-6808 - [Ruby] Ensure requiring suitable MSYS2 package
  • ARROW-6809 - [Ruby] Gem does not install on macOS due to glib2 3.3.7 compilation failure
  • ARROW-6812 - [Java] Fix License header
  • ARROW-6813 - [Ruby] Arrow::Table.load with headers=true leads to exception in Arrow 0.15
  • ARROW-6820 - [Format] Update Map type child to "entries"
  • ARROW-6834 - [C++][TRIAGE] Pin gtest version 1.8.1 to unblock Appveyor builds
  • ARROW-6835 - [Archery][CMake] Restore ARROW_LINT_ONLY cmake option
  • ARROW-6842 - [Website] Jekyll error building website
  • ARROW-6844 - [C++][Parquet] Fix regression in reading List types with item name that is not "item"
  • ARROW-6846 - [C++] Build failures with glog enabled
  • ARROW-6857 - [C++] Fix DictionaryEncode for zero-chunk ChunkedArray
  • ARROW-6859 - [CI][Nightly] Disable docker layer caching for CircleCI tasks
  • ARROW-6860 - [Python][C++] Do not link shared libraries monolithically to pyarrow.lib, add libarrow_python_flight.so
  • ARROW-6861 - [C++] Fix length/null_count/capacity accounting through Reset and AppendIndices in DictionaryBuilder
  • ARROW-6864 - [C++] Add compression-related compile definitions before adding any unit tests
  • ARROW-6867 - [FlightRPC][Java] clean up default executor
  • ARROW-6868 - [Go] Fix slicing struct arrays
  • ARROW-6869 - [C++] Do not return invalid arrays from DictionaryBuilder::Finish when reusing builder. Add "FinishDelta" method and "ResetFull" method
  • ARROW-6873 - [Python] Remove stale CColumn references
  • ARROW-6874 - [Python] Fix memory leak when converting to Pandas object data
  • ARROW-6876 - [C++][Parquet] Use shared_ptr to avoid copying ReaderContext struct, fix performance regression with reading many columns
  • ARROW-6877 - [C++] Add additional Boost versions to support 1.71 and the presumed next 2 future versions
  • ARROW-6878 - [Python] Fix creating array from list of dicts with bytes keys
  • ARROW-6882 - [C++] Ensure the DictionaryArray indices has no dictionary data
  • ARROW-6885 - [Python] Remove superfluous skipped timedelta test
  • ARROW-6886 - [C++] Fix arrow::io nvcc compiler warnings
  • ARROW-6898 - [Java][hotfix] fix ArrowWriter memory leak
  • ARROW-6898 - [Java] Fix potential memory leak in ArrowWriter and several test classes
  • ARROW-6899 - [Python] Decode dictionary-encoded List children to dense when converting to pandas
  • ARROW-6901 - [Rust][Parquet] Increment total_num_rows when closing a row group
  • ARROW-6903 - [Python] Attempt to fix Python wheels with introduction of libarrow_python_flight, disabling of pyarrow.orc
  • ARROW-6905 - [Gandiva][Crossbow] Use xcode9.4 for osx builds, do not build dataset, filesystem
  • ARROW-6910 - [C++][Python] Set jemalloc default configuration to release dirty pages more aggressively back to the OS (dirty_decay_ms and muzzy_decay_ms set to 0 by default); add C++ / Python option to configure this
  • ARROW-6913 - [R] Potential bug in compute.cc
  • ARROW-6914 - [CI] docker-clang-format nightly failing
  • ARROW-6922 - [Python] Compat with pandas for MultiIndex.levels.names
  • ARROW-6925 - [C++] Only add -stdlib flag on MacOS when using clang.
  • ARROW-6929 - [C++] Remove first offset==0 check from Validate()
  • ARROW-6937 - [Packaging][Python] Fix conda linux and OSX wheel nightly builds
  • ARROW-6938 - [Packaging][Python] Disable bz2 in Windows wheels and build ZSTD in bundled mode to triage linking issues
  • ARROW-6948 - [Rust][Parquet] Fix boolean array in arrow reader.
  • ARROW-6950 - [C++][Dataset] Add dataset benchmark example
  • ARROW-6957 - [CI][Crossbow] Nightly R with sanitizers build fails installing dependencies
  • ARROW-6962 - [C++][CI] Stop compiling with -Weverything
  • ARROW-6966 - [Go] Set a default memset for when the platform doesn't set one
  • ARROW-6977 - [C++] Disable jemalloc background_thread on macOS
  • ARROW-6983 - [C++] Fix ThreadedTaskGroup lifetime issue
  • ARROW-6989 - [Python] Check for out of range precision decimals in python conversion
  • ARROW-6992 - [C++] Undefined Behavior sanitizer build option fails with GCC
  • ARROW-6999 - [Python] Fix unnamed index when specifying schema in Table.from_pandas
  • ARROW-7013 - [C++] arrow-dataset pkgconfig is incomplete
  • ARROW-7020 - [Java] Fix the bugs when calculating vector hash code
  • ARROW-7021 - [Java] UnionFixedSizeListWriter decimal type should check writer index
  • ARROW-7022, ARROW-7023 - [Python] Fix handling of pandas Index and Period/Interval extension arrays in pa.array
  • ARROW-7023 - [Python] pa.array does not use "from_pandas" semantics for pd.Index
  • ARROW-7024 - [CI][R] Update R dependencies for Conda build
  • ARROW-7027 - [Python] Correctly raise error in pa.table(..) on invalid input
  • ARROW-7033 - [C++] Set SDKROOT automatically on macOS
  • ARROW-7045 - [R] Preserve factor in Parquet roundtrip
  • ARROW-7050 - [R] Fix compiler warnings in R bindings
  • ARROW-7053 - [Python] setuptools-scm produces incorrect version at apache-arrow-0.15.1 tag
  • ARROW-7056 - [Python] Fix test_fs failures when S3 not enabled
  • ARROW-7059 - [C++][Parquet] Mostly fix performance regression when reading Parquet file with many columns
  • ARROW-7074 - [C++] ASSERT_OK_AND_ASSIGN should use ASSERT_OK instead of EXPE…
  • ARROW-7077 - [C++] Casting dictionary to unrelated value type shouldn't crash
  • ARROW-7087 - [Python] Metadata disappear from pandas dataset
  • ARROW-7097 - [Rust][CI] Apply rustfmt nightly
  • ARROW-7100 - [C++][HDFS] Fix search directories for libjvm.so
  • ARROW-7105 - [CI][Crossbow] Nightly homebrew-cpp job fails
  • ARROW-7106 - [Java] Fix the problem that flight perf test hangs endlessly
  • ARROW-7117 - [C++][CI] Fix the hanging C++ tests in Windows 2019
  • ARROW-7128 - [CI] Use proper version for fedora tests in GitHub actions cron jobs
  • ARROW-7133 - [CI] Allow GH Actions to run on all branches
  • ARROW-7142 - [C++] GCC compilation failures in nightlies
  • ARROW-7152 - [Java] Delete useless class DiffFunction
  • ARROW-7157 - [R] Add validation, helpful error message to Object$new()
  • ARROW-7158 - [C++] Use compiler information provided by CMake
  • ARROW-7163 - [Doc] Fix double-and typos
  • ARROW-7164 - [CI] Dev cron github action is failing every 15 minutes
  • ARROW-7167 - [CI][Python] Add nightly tests for additional pandas versions to Github Actions
  • ARROW-7168 - [Python] Respect the specified dictionary type for pd.Categorical conversion
  • ARROW-7170 - [C++] Fix linking with bundled ORC
  • ARROW-7180 - [CI] Java builds are not triggered on the master branch
  • ARROW-7181 - [C++] Fix an Arrow module search bug with pkg-config
  • ARROW-7183 - [CI][Crossbow] Re-skip r-sanitizer nightly tests
  • ARROW-7187 - [C++][Doc] doxygen broken on master because of @
  • ARROW-7188 - [C++][Doc] doxygen broken on master: missing param implicit_casts
  • ARROW-7189 - [CI][Crossbow] Nightly conda osx builds fail
  • ARROW-7194 - [Rust] Fix CSV writer recursion issues
  • ARROW-7199 - [Java] Fix ConcurrentModificationException in BaseAllocator::getChildAllocators
  • ARROW-7200 - [C++][Flight] Enable the server to serve to remote clients
  • ARROW-7209 - [Python] Fix tests on pandas master related to extension dtype conversion
  • ARROW-7212 - [Go] add missing Release to benchmark code
  • ARROW-7214 - [Python] Fix pickling of DictionaryArray
  • ARROW-7217 - [CI][Python] Use correct python version in Github Actions
  • ARROW-7225 - [C++] Fix *std::move(Result<T>) for move-only T
  • ARROW-7249 - [CI] Release test fails in master due to new arrow-flight Rust crate
  • ARROW-7250 - [C++] Define constexpr symbols explicitly in StringToFloatConverter::Impl
  • ARROW-7253 - [CI] Fix failure in release test
  • ARROW-7254 - [Java] BaseVariableWidthVector#setSafe appears to make value offsets inconsistent
  • ARROW-7264 - [Java] RangeEqualsVisitor type check is not correct
  • ARROW-7266 - [C++] Fix ArrayDataVisitor on sliced binary-like array
  • ARROW-7271 - [C++][Flight] Use the single parameter version of SetTotalBytesLimit
  • ARROW-7281 - [C++] Make Adaptive builders' length match expectations
  • ARROW-7282 - [Python] IO functions should raise the right exceptions
  • ARROW-7291 - [Dev] Fix FORMAT_DIR
  • ARROW-7294 - [Python] converted_type_name_from_enum(): Incorrect name for INT_64
  • ARROW-7295 - [R] Fix bad test that causes failure on R < 3.5
  • ARROW-7298 - [C++] Fix thirdparty dependency downloader script
  • ARROW-7314 - [Python] Fix compiler warning in pyarrow.union
  • ARROW-7318 - [C#] TimestampArray serialization failure
  • ARROW-7320 - [C++] Specify CMAKE_INSTALL_LIBDIR for gbenchmark
  • ARROW-7327 - [CI] Failing C GLib and R buildbot builders
  • ARROW-7328 - [CI] GitHub Actions should trigger on changes to GitHub Actions configuration
  • ARROW-7341 - [CI] Unbreak nightly Conda R job
  • ARROW-7343 - [Java][FlightRPC] prevent leak in DoGet
  • ARROW-7349 - [C++] Fix the bug of parsing string hex values
  • ARROW-7353 - [C++] Ignore -Wmissing-braces when building with clang
  • ARROW-7354 - [C++] Fix crash in test-io-hdfs
  • ARROW-7355 - [CI] Environment variables are defined twice for the fuzzit builds
  • ARROW-7358 - [CI] [Dev] [C++] ccache disabled on conda-python-hdfs
  • ARROW-7359 - [C++][Gandiva] Don't throw error for locate function for start position exceeding string length
  • ARROW-7360 - [R] Can't use dplyr filter() with variables defined in parent scope
  • ARROW-7361 - [Rust] Build directory is not passed to ci/scripts/rust_test.sh
  • ARROW-7362 - [Python][C++] Added ListArray.Flatten() that properly flattens a ListArray
  • ARROW-7374 - [Dev][C++] Fix cuda-cpp docker build
  • ARROW-7381 - [C++] Unbreak manylinux1 wheels after Iterator refactor
  • ARROW-7386 - [C#] Array offset does not work properly
  • ARROW-7388 - [Python] Skip HDFS tests if libhdfs cannot be located
  • ARROW-7389 - [Python][Packaging] Remove pyarrow.s3fs import check from the recipe
  • ARROW-7393 - [Plasma] Fix plasma executable name in plasma_java build
  • ARROW-7395 - [C++] Do not warn or error on logical "or" with constants
  • ARROW-7397 - [C++][JSON] Fix white space length detection error
  • ARROW-7404 - [C++][Gandiva] Fix utf8 char length error on Arm64
  • ARROW-7406 - [Java] NonNullableStructVector#hashCode should pass hasher to child vectors
  • ARROW-7407 - [Python] Declare NumPy a PEP517 build dependency
  • ARROW-7408 - [C++] Fix compilation of reference benchmarks
  • ARROW-7435 - [C++] Validate all list / binary offsets in ValidateFull()
  • ARROW-7436 - [Archery] Enable more benchmark binaries in archery benchmark
  • ARROW-7437 - [Java] ReadChannel#readFully does not set writer index correctly
  • ARROW-7442 - [Ruby] Add abstract type check to Arrow::DataType.resolve
  • ARROW-7447 - [Java] ComplexCopier does incorrect copy in some cases
  • ARROW-7450 - [C++] Also link boost_filesystem when using static test linkage
  • ARROW-7458 - [GLib] Fix incorrect build dependency in Makefile
  • ARROW-7471 - [CI][Python] Run flake8 on Cython files
  • ARROW-7472 - [Java] Fix some incorrect behavior in UnionListWriter
  • ARROW-7478 - [Rust][DataFusion] Group by expression ignored unless paired with aggregate expression
  • ARROW-7492 - [CI][Crossbow] Nightly homebrew-cpp job fails on Python installation
  • ARROW-7497 - [Python] Stop relying on (deprecated) pandas.util.testing, move to pandas.testing
  • ARROW-7500 - [C++][Dataset] Remove std::regex usage
  • ARROW-7503 - [Rust][Parquet] Fix build failures
  • ARROW-7506 - [Java] JMH benchmarks should be called from main methods
  • ARROW-7508 - [C#] DateTime32 Reading is Broken
  • ARROW-7510 - [C++] Make ArrayData::null_count thread-safe
  • ARROW-7516 - [C#] Fix .NET Benchmarks
  • ARROW-7518 - [Python] Use PYARROW_WITH_HDFS when building wheels, conda packages
  • ARROW-7527 - [Python] Fix pandas/feather tests for unsupported types with pandas master
  • ARROW-7528 - [Python] Remove usage of deprecated pd.np and pd.datetime in tests
  • ARROW-7535 - [C++] Fix ASAN failures in Array::Validate()
  • ARROW-7543 - [R] Fixes R arrow::write_parquet() documentation code examples
  • ARROW-7545 - [C++] [Dataset] Scanning dataset with dictionary type hangs
  • ARROW-7551 - [FlightRPC][C++] Flight test on macOS fails due to Homebrew gRPC
  • ARROW-7552 - [C++][CI] Disable timing-sensitive tests on public CI
  • ARROW-7554 - [C++] Add support for building on FreeBSD
  • ARROW-7559 - [Rust] Incorrect index check assertion in StringArray and BinaryArray
  • ARROW-7561 - [Doc][Python] Add missing conda_env_gandiva.yml in python.rst
  • ARROW-7563 - [Rust] failed to select a version for `byteorder`
  • ARROW-7582 - [Rust][Flight] Unable to compile arrow.flight.protocol.rs
  • ARROW-7583 - [FlightRPC][C++] relax auth tests due to nondeterminism
  • ARROW-7591 - [Python] Fix DictionaryArray.to_numpy() to return decoded numpy array
  • ARROW-7592 - [C++] Fix crashes on corrupt IPC input
  • ARROW-7593 - [CI][Python] Python datasets failing / not run on CI
  • ARROW-7595 - [R][CI] R appveyor job fails due to pacman compression change
  • ARROW-7596 - [Python] Only permit zero-copy DataFrame block construction when split_blocks=True
  • ARROW-7599 - [Java] Fix build break due to change in RangeEqualsVisitor
  • ARROW-7603 - [Packaging][RPM] Add workaround for LLVM on CentOS 8
  • ARROW-7611 - [Packaging][Python] Fix artifacts patterns for wheel
  • ARROW-7612 - [Packaging][Python] Fix artifacts path for Conda on Windows
  • ARROW-7614 - [Python] Limit size of data in test_parquet.py::test_set_data_page_size
  • ARROW-7618 - [C++] Fix crashes or undefined behaviour on corrupt IPC input
  • ARROW-7620 - [Rust] Remove call to flatc
  • ARROW-7621 - [Doc] Fix doc build
  • ARROW-7634 - [Python] Run pyarrow.dataset tests on Appveyor + fix failures to parse Windows file paths
  • ARROW-7638 - [C++][Dataset] Fix a segfault in DirectoryPartitioningFactory
  • ARROW-7639 - [R] Cannot convert Dictionary Array to R when values aren't strings
  • ARROW-7640 - [C++][Dataset][Parquet] Detect missing compression support
  • ARROW-7647 - [C++] Repair JSON parser's handling of ListArrays
  • ARROW-7650 - [C++][Dataset] enable dataset tests on Windows
  • ARROW-7651 - [CI][Crossbow] Nightly macOS wheel builds fail
  • ARROW-7652 - [Python][Dataset] Use implicit cast in ScannerBuilder.filter
  • ARROW-7661 - [Python] Test for optimal CSV chunking
  • ARROW-7689 - [FlightRPC][C++] bump bundled gRPC to 1.25 to fix MacOS test failure
  • ARROW-7690 - [R] Cannot write parquet to OutputStream
  • ARROW-7693 - [CI] Fix test name for Spark integration, add new tests
  • ARROW-7709 - [Python] Preserve column name in conversion from Table column to pandas for non-ns timestamps
  • ARROW-7714 - [Release] Add missing variable expansion
  • ARROW-7718 - [Release] Fix auto-retry in the binary release script
  • ARROW-7723 - [Python] Triage untested functional regression when converting tz-aware timestamp inside struct to pandas/NumPy format
  • ARROW-7727 - [Python] Unable to read a ParquetDataset when schema validation is on.
  • ARROW-8135 - [Python] Problem importing PyArrow on a cluster
  • ARROW-8638 - Arrow Cython API Usage Gives an error when calling CTable API Endpoints
  • PARQUET-1692 - [C++] Don't use the same CMake variable name for thirdparty version and found version
  • PARQUET-1692 - [C++] LogicalType::FromThrift error on Centos 7 RPM
  • PARQUET-1693 - [C++] Fix parquet examples with compression define guards
  • PARQUET-1702 - [C++] Make BufferedRowGroupWriter compatible with parquet encryption
  • PARQUET-1706 - [C++] Wrong dictionary_page_offset when writing only data pages via BufferedPageWriter
  • PARQUET-1707 - [C++] parquet-arrow-test fails with UBSAN
  • PARQUET-1709 - [C++] Avoid unnecessary temporary std::shared_ptr copies
  • PARQUET-1715 - [C++] Add the Parquet code samples to CI + Refactor Parquet Encryption Samples
  • PARQUET-1720 - [C++] JSONPrint not showing version correctly
  • PARQUET-1747 - [C++] Access to ColumnChunkMetaData fails when encryption is on
  • PARQUET-1766 - [C++] Handle parquet::Statistics NaNs and -0.0f as per upstream parquet-mr
  • PARQUET-1772 - [C++] ParquetFileWriter: Data overwritten in append mode

New Features and Improvements

  • ARROW-412 - [Format][Documentation] Clarify that Buffer.size in Flatbuffers should reflect the actual memory size rather than the padded size
  • ARROW-501 - [C++] Implement concurrent / buffering InputStream for streaming data use cases
  • ARROW-772 - [C++] Implement take kernel functions
  • ARROW-843 - [C++][Dataset] Ensure Schemas are unified in DataSourceDiscovery
  • ARROW-976 - [C++][Python] Provide API for defining and reading Parquet datasets with more ad hoc partition schemes
  • ARROW-1036 - [C++] Define abstract API for filtering Arrow streams (e.g. predicate evaluation)
  • ARROW-1119 - [Python/C++] Implement NativeFile interfaces for Amazon S3
  • ARROW-1175 - [Java] Implement/test dictionary-encoded subfields
  • ARROW-1456 - [Python] Run s3fs unit tests in Travis CI
  • ARROW-1562 - [C++] Numeric kernel implementations for add
  • ARROW-1638 - [Java] IPC roundtrip for null type
  • ARROW-1900 - [C++] Add kernel for min / max
  • ARROW-2428 - [Python] Support pandas ExtensionArray in Table.to_pandas conversion
  • ARROW-2602 - [Packaging] Automate build of development docker containers
  • ARROW-2863 - [Python] Add context manager APIs to RecordBatch*Writer/Reader classes
  • ARROW-3085 - [Rust] Add an adapter for parquet.
  • ARROW-3408 - [C++] Add CSV option to automatically attempt dict encoding
  • ARROW-3444 - [Python] Add Array/ChunkedArray/Table nbytes attribute
  • ARROW-3706 - [Rust] Add record batch reader trait.
  • ARROW-3789 - [Python] Use common conversion path for Arrow to pandas.Series/DataFrame. Zero copy optimizations for DataFrame, add split_blocks and self_destruct options
  • ARROW-3808 - [R] Array extract, including Take method
  • ARROW-3813 - [R] lower level construction of Dictionary Arrays
  • ARROW-4059 - [Rust] Parquet/Arrow Integration
  • ARROW-4091 - [C++] Curate default list of CSV null spellings
  • ARROW-4208 - [CI/Python] Have automated tests for S3
  • ARROW-4219 - [Rust][Parquet] Initial support for arrow reader.
  • ARROW-4223 - [Python] Support scipy.sparse integration
  • ARROW-4224 - [Python] Support integration with pydata/sparse library
  • ARROW-4225 - [Format][C++] Add CSC sparse matrix support
  • ARROW-4722 - [C++] Implement Bitmap class to modularize handling of bitmaps
  • ARROW-4748 - [Rust][DataFusion] Optimize GROUP BY aggregate queries
  • ARROW-4930 - [C++] Improve find_package() support
  • ARROW-5180 - [Rust] IPC Support
  • ARROW-5181 - [Rust] Initial support for Arrow File reader
  • ARROW-5182 - [Rust] Arrow IPC file writer
  • ARROW-5227 - [Rust] [DataFusion] Re-implement query execution with an extensible physical query plan
  • ARROW-5277 - [C#] MemoryAllocator.Allocate(length: 0) doesn't return null
  • ARROW-5333 - [C++] Clamp build option summary width to 90
  • ARROW-5366 - [Rust] Duration and Interval Arrays
  • ARROW-5400 - [Rust] Test/ensure that reader and writer support zero-length record batches
  • ARROW-5445 - [Website] Remove language that encourages pinning a version
  • ARROW-5454 - [C++] Implement Take on ChunkedArray for DataFrame use
  • ARROW-5502 - [R] file readers should mmap
  • ARROW-5508 - [C++] Create reusable Iterator<T> interface
  • ARROW-5523 - [Python][Packaging] Use HTTPS consistently for downloading wheel dependencies
  • ARROW-5712 - [C++][Parquet] Arrow time32/time64/timestamp ConvertedType not being restored properly
  • ARROW-5767 - [Format] Permit dictionary replacements in IPC protocol
  • ARROW-5801 - [CI] Dockerize (add to docker-compose) all Travis CI Linux tasks
  • ARROW-5802 - [CI][Archery] Dockerify lint utilities
  • ARROW-5804 - [C++] Dockerize C++ CI job with conda-forge toolchain, code coverage from Travis CI
  • ARROW-5805 - [Python] Dockerize (add to docker-compose) Python Travis CI job
  • ARROW-5806 - [CI] Dockerize (add to docker-compose) Integration tests Travis CI entry
  • ARROW-5807 - [JS] Dockerize NodeJS Travis CI entry
  • ARROW-5808 - [GLib][Ruby] Dockerize (add to docker-compose) current GLib + Ruby Travis CI entry
  • ARROW-5809 - [CI][Rust] Travis runs dockerized Rust build
  • ARROW-5810 - [Go] Dockerize Travis CI Go build
  • ARROW-5831 - [Release] Add Python program to download binary artifacts in parallel, allow abort/resume
  • ARROW-5839 - [Python] Test manylinux2010 in CI
  • ARROW-5855 - [Python] Support for Duration (timedelta) type
  • ARROW-5859 - [Python] Support ExtensionArray.to_numpy using storage array
  • ARROW-5971 - [Website] Blog post introducing Arrow Flight
  • ARROW-5994 - [CI] [Rust] Create nightly releases of the Rust implementation
  • ARROW-6003 - [C++] Better input validation and error messaging in CSV reader
  • ARROW-6074 - [FlightRPC][Java] Middleware
  • ARROW-6091 - [Rust][DataFusion] Implement physical execution plan for LIMIT
  • ARROW-6109 - [Integration] Docker image for integration testing can't be built on windows
  • ARROW-6112 - [Java] Support int64 buffer lengths in Java
  • ARROW-6184 - [Java] Provide hash table based dictionary encoder
  • ARROW-6251 - [Developer] Add PR merge tool to apache/arrow-site
  • ARROW-6257 - [C++] Add fnmatch compatible globbing function
  • ARROW-6274 - [Rust][DataFusion] Add support for writing results to CSV
  • ARROW-6277 - [C++][Parquet] Support direct DictionaryArray write of all parquet types
  • ARROW-6283 - [Rust][DataFusion] Implement Context::write_csv to write partitioned CSV results
  • ARROW-6285 - [GLib] Add support for LargeBinary and LargeString types
  • ARROW-6286 - [GLib] Add support for LargeList type
  • ARROW-6299 - [C++] Simplify FileFormat classes to singletons
  • ARROW-6321 - [Python] Ability to create ExtensionBlock on conversion to pandas
  • ARROW-6340 - [R] Implements low-level bindings to Dataset classes
  • ARROW-6341 - [Python] Implement low-level bindings for Dataset
  • ARROW-6352 - [Java] Add implementation of DenseUnionVector
  • ARROW-6367 - [C++][Gandiva] Implement string reverse
  • ARROW-6378 - [C++][Dataset] Implement recursive TreeDataSource
  • ARROW-6386 - [C++][Documentation] Explicit documentation of null slot interpretation
  • ARROW-6394 - [Java] Support conversions between delta vector and partial sum vector
  • ARROW-6396 - [C++] Add overloads of Boolean kernels implementing Kleene logic
  • ARROW-6398 - [C++] Consolidate ScanOptions and ScanContext
  • ARROW-6405 - [Python] Add std::move wrapper for use in Cython
  • ARROW-6452 - [Java] Override ValueVector toString() method
  • ARROW-6463 - [C++][Python] Rename arrow::fs::Selector to FileSelector
  • ARROW-6466 - [Integration][CI] Move integration test code to archery integration command. Dockerize integration tests
  • ARROW-6468 - [C++] Remove unused hashing routines
  • ARROW-6473 - Dictionary encoding format clarifications/future proofing
  • ARROW-6503 - [C++] Add an argument of memory pool object to SparseTensorConverter
  • ARROW-6508 - [C++] Add Tensor and SparseTensor factory function with validations
  • ARROW-6515 - [C++] Clean type_traits.h definitions
  • ARROW-6578 - [C++] Allow casting number to string
  • ARROW-6592 - [Java] Add support for skipping decoding of columns/field in Avro converter
  • ARROW-6594 - [Java] Support logical type encodings from Avro
  • ARROW-6598 - [Java] Sort the code for ApproxEqualsVisitor and provide an interface for custom vector equality
  • ARROW-6608 - [C++] Make default for ARROW_HDFS to be OFF
  • ARROW-6610 - [C++] Add cmake option to disable filesystem layer
  • ARROW-6611 - [C++] Make ARROW_JSON=OFF the default
  • ARROW-6612 - [C++] Add ARROW_CSV CMake build flag
  • ARROW-6619 - [Ruby] Add support for building Gandiva::Expression by Arrow::Schema#build_expression
  • ARROW-6624 - [C++][Python] Add SparseTensor.ToTensor() method
  • ARROW-6625 - [C++][Python] Allow concat_tables to null fill missing columns
  • ARROW-6631 - [C++] Do not build any compression libraries by default in C++ build
  • ARROW-6632 - [C++] Do not build with ARROW_COMPUTE=on and ARROW_DATASET=on by default
  • ARROW-6633 - [C++] Vendor double-conversion library
  • ARROW-6634 - [C++][FOLLOWUP] Remove Flatbuffers EP remnants from C++ Dockerfiles
  • ARROW-6634 - [C++] Vendor Flatbuffers and check in compiled sources
  • ARROW-6635 - [C++] Disable glog integration by default
  • ARROW-6636 - [C++] Do not build command line tools by default
  • ARROW-6637 - [Packaging][FOLLOWUP] Enable necessary components in Autobrew build for R
  • ARROW-6637 - [C++] Further streamline default build, add ARROW_CSV CMake option
  • ARROW-6646 - [Go] Write no IPC buffer metadata for NullType
  • ARROW-6650 - [Rust][Integration] Compare integration JSON with schema & batch
  • ARROW-6656 - [Rust][DataFusion] Add MAX, MIN expressions
  • ARROW-6657 - [Rust][DataFusion] Add Count Aggregate Expression
  • ARROW-6658 - [Rust][DataFusion] Implement AVG expression
  • ARROW-6659 - [Rust][DataFusion] Refactor of HashAggregateExec to support custom merge
  • ARROW-6662 - [Java] Implement equals/approxEquals API for VectorSchemaRoot
  • ARROW-6671 - [C++][Python] Use more consistent names for sparse tensor items
  • ARROW-6672 - [Java] Extract a common interface for dictionary builders
  • ARROW-6685 - [C++] Ignore trailing slashes in S3FS
  • ARROW-6686 - [CI] Pull and push docker images to speed up the nightly builds
  • ARROW-6688 - [Packaging] Include s3 support in the conda packages
  • ARROW-6690 - [Rust][DataFusion] Optimize aggregates without GROUP BY to use SIMD
  • ARROW-6692 - [Rust][DataFusion] Update examples to use physical query plan
  • ARROW-6693 - [Rust] [DataFusion] Update unit tests to use physical query plan
  • ARROW-6694 - [Rust][DataFusion] Integration tests now use physical query plan
  • ARROW-6695 - [Rust][DataFusion] Remove legacy code for executing logical plan
  • ARROW-6696 - [Rust][DataFusion] Implement simple math operations in physical query plan
  • ARROW-6700 - [Rust][DataFusion] Use new Arrow Parquet reader
  • ARROW-6707 - [Java] Improve the performance of JDBC adapters by using nullable information
  • ARROW-6710 - [Java] Add JDBC adapter test to cover cases which contains some null values
  • ARROW-6711 - [C++] Consolidate Filter and Expression
  • ARROW-6721 - [Java] Avro adapter benchmark only runs once in JMH
  • ARROW-6722 - [Java] Provide a uniform way to get vector name
  • ARROW-6729 - [C++] Prevent data copying in StlStringBuffer
  • ARROW-6730 - [CI] Use GitHub Actions for "C++ with clang 7" docker image
  • ARROW-6731 - [CI] [Rust] Set up Github Action to run Rust tests
  • ARROW-6732 - [Java] Implement quick sort in a non-recursive way to avoid stack overflow
  • ARROW-6741 - [Release] Update changelog.py to use APACHE_ prefixed JIRA_USERNAME and JIRA_PASSWORD environment variables
  • ARROW-6742 - [C++] Remove boost::filesystem dependency in hdfs_internal.cc
  • ARROW-6743 - [C++] Remove usage of boost::filesystem
  • ARROW-6744 - [Rust] Publicly expose JsonEqual
  • ARROW-6754 - [C++] Merge allocator.h into stl.h
  • ARROW-6758 - [Developer] Install local NodeJS via nvm when running release verification
  • ARROW-6764 - [C++] Create a readahead iterator
  • ARROW-6767 - [JS] Lazily bind batches in scan/scanReverse
  • ARROW-6768 - [C++][Dataset] Add method to convert from Scanner to Table
  • ARROW-6769 - [Dataset][C++] End to end test
  • ARROW-6770 - [CI][Travis] Download Minio quietly
  • ARROW-6777 - [GLib][CI] Unpin gobject-introspection gem
  • ARROW-6778 - [C++] Support cast for DurationType
  • ARROW-6782 - [C++] Do not require Boost for minimal C++ build
  • ARROW-6784 - [C++][R] Move filter and take for ChunkedArray, RecordBatch, and Table from Rcpp to C++ library
  • ARROW-6787 - [CI][C++] Decommission "C++ with clang 7 and system packages" Travis CI job
  • ARROW-6788 - [CI][Dev] Exercise merge script tests
  • ARROW-6789 - [Python] Improve ergonomics by automatically boxing Action and Result in do_action RPC
  • ARROW-6790 - [Release] Enable selected integration tests in release verification
  • ARROW-6793 - [R] Arrow C++ binary packaging for Linux
  • ARROW-6797 - [Release] Use a separately cloned arrow-site repository in the website post release script
  • ARROW-6802 - [Packaging][deb][RPM] Update qemu-user-static package URL
  • ARROW-6803 - [Rust][DataFusion] Performance optimization for single partition aggregate queries
  • ARROW-6804 - [CI][Rust] Migrate Travis job to Github Actions
  • ARROW-6807 - [Java][FlightRPC] Expose gRPC service & client
  • ARROW-6810 - [Website] Add docs for R package 0.15 release
  • ARROW-6811 - [R] Assorted post-0.15 release cleanups
  • ARROW-6814 - [C++] Resolve compiler warnings that occur on release builds
  • ARROW-6822 - [Website] merge_pr.py is published
  • ARROW-6824 - [Plasma] Allow creation of multiple objects through a single IPC in Plasma Store
  • ARROW-6825 - [C++] Rework CSV reader IO around readahead iterator
  • ARROW-6831 - [R] Update R macOS/Windows builds for change in cmake compression defaults
  • ARROW-6832 - [R] Implement Codec::IsAvailable
  • ARROW-6833 - [R][CI] Add crossbow job for full R autobrew macOS build
  • ARROW-6836 - [Format][KeyValue] field to the Footer table in File.fbs
  • ARROW-6843 - [Website] Disable deploy on pull request
  • ARROW-6847 - [C++] Add range_expression adapter to Iterator
  • ARROW-6850 - [Java] Jdbc converter support Null type
  • ARROW-6852 - [C++] Fix build issue on memory-benchmark
  • ARROW-6853 - [Java] Support vector and dictionary encoder use different hasher for calculating hashCode
  • ARROW-6855 - [FlightRPC][C++][Python] Flight middleware for C++/Python
  • ARROW-6862 - [Developer] Check pull request title
  • ARROW-6863 - [Java] Provide parallel searcher
  • ARROW-6865 - [Java] Improve the performance of comparing an ArrowBuf against a byte array
  • ARROW-6866 - [Java] Improve the performance of calculating hash code for struct vector
  • ARROW-6879 - [Rust] Add explicit SIMD for sum kernel
  • ARROW-6880 - [Rust] Add explicit SIMD for min/max kernel
  • ARROW-6881 - [Rust] Remove "array_ops" in favor of the "compute" sub-module
  • ARROW-6884 - [Python] Format friendlier message in Python when a server-side RPC handler fails
  • ARROW-6887 - [Java] Create prose documentation for using ValueVectors
  • ARROW-6888 - [Java] Support copy operation for vector value comparators
  • ARROW-6889 - [Java] ComplexCopier enable FixedSizeList type & fix RangeEqualsVisitor StackOverFlow
  • ARROW-6891 - [Rust][Parquet] utf8 support for arrow reader.
  • ARROW-6902 - [C++][Compute] Add String/Binary support to Compare kernel
  • ARROW-6904 - [Python] Add support for MapArray
  • ARROW-6907 - [Plasma] Allow Plasma to send batched notifications.
  • ARROW-6911 - [Java] Provide composite comparator
  • ARROW-6912 - [Java] Extract a common base class for avro converter consumers
  • ARROW-6916 - [Developer] Sort tasks by name in Crossbow e-mail report
  • ARROW-6918 - [R] Make docker-compose setup faster
  • ARROW-6919 - [Python] Expose more builders in Cython
  • ARROW-6920 - [Packaging] Build python 3.8 wheels
  • ARROW-6926 - [Python] Support __sizeof__ protocol for Python objects
  • ARROW-6927 - [C++] Add gRPC version check
  • ARROW-6928 - [Rust] Add support for FixedSizeListArray
  • ARROW-6930 - [Java] Create utility class for populating vector values used for test purpose only
  • ARROW-6932 - [Java] Incorrect log on known extension type
  • ARROW-6933 - [Java] Support linear dictionary encoder
  • ARROW-6936 - [Python] Improve error message when unwrapping object fails
  • ARROW-6942 - [Developer] Add support for Parquet in pull request check by GitHub Actions
  • ARROW-6943 - [Website] Translate Apache Arrow Flight introduction to Japanese
  • ARROW-6944 - [Rust] Add String, FixedSizeBinary types
  • ARROW-6949 - [Java] Fix promotable writer to handle nullvectors
  • ARROW-6951 - [C++][Dataset] Column projection in ParquetFragment
  • ARROW-6952 - [C++][Dataset] Implement predicate pushdown with ParquetFileFragment
  • ARROW-6954 - [Python][CI] Add Python 3.8 to CI matrix
  • ARROW-6960 - [R] Add lz4 and zstd to R PKGBUILD
  • ARROW-6961 - [C++][Gandiva] Add string lower function in Gandiva
  • ARROW-6963 - [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from travis builds
  • ARROW-6964 - [C++][Dataset] Add multithread support to Scanner::ToTable
  • ARROW-6965 - [C++][Dataset] Optionally expose partition keys as columns
  • ARROW-6967 - [C++][Dataset] IN, IS_VALID filter expressions
  • ARROW-6969 - [C++][Dataset] ParquetScanTask defer memory usage
  • ARROW-6970 - [Packaging][RPM] Add support for CentOS 8
  • ARROW-6973 - [C++][ThreadPool] Use perfect forwarding in Submit
  • ARROW-6975 - [C++] Put make_unique in its own header
  • ARROW-6980 - [R] dplyr backend for RecordBatch/Table
  • ARROW-6984 - [C++] Update LZ4 to 1.9.2 for CVE-2019-17543
  • ARROW-6986 - [R] Add basic Expression class
  • ARROW-6987 - [CI] Travis OSX failing to install sdk headers
  • ARROW-6991 - [Packaging][deb] Add support for Ubuntu 19.10
  • ARROW-6994 - [C++] Fix aggressive RSS inflation on macOS when jemalloc background_thread is not enabled
  • ARROW-6997 - [Packaging][RPM] Add apache-arrow-release
  • ARROW-7000 - [C++][Gandiva] Handle empty inputs in string upper, lower functions
  • ARROW-7003 - [Rust] Generate flatbuffers files in docker build image
  • ARROW-7004 - [Plasma] Make it possible to bump up object in LRU cache
  • ARROW-7006 - [Rust] Bump flatbuffers version to avoid vulnerability
  • ARROW-7007 - [C++] Add use_mmap option to LocalFS
  • ARROW-7014 - [Developer][Release] Add "wheels" verification option to verify-release-candidate.sh for Linux and macOS
  • ARROW-7015 - [Developer] Write script to verify macOS wheels given local environment with conda or virtualenv
  • ARROW-7016 - [Developer][Python] Add Windows batch script to test Python wheels for release candidate
  • ARROW-7019 - [Java] Improve the performance of loading validity buffers
  • ARROW-7026 - [Java] Remove assertions in MessageSerializer/vector/writer/reader
  • ARROW-7031 - [Python] Correct LargeListArray.offsets attribute
  • ARROW-7031 - [Python] Expose the offsets of a ListArray in python
  • ARROW-7032 - [Release] Run the python unit tests in the release verification script
  • ARROW-7034 - [CI][Crossbow] Skip known nightly failures
  • ARROW-7035 - [R] Default arguments are unclear in write_parquet docs
  • ARROW-7036 - [C++] Version up ORC to avoid compile errors
  • ARROW-7037 - [C++] Compile error on the combination of protobuf >= 3.9 and clang
  • ARROW-7039 - [Python] Fix pa.table/record_batch typecheck to work without pandas
  • ARROW-7047 - [C++] Insert implicit casts in ScannerBuilder::Finish
  • ARROW-7052 - [C++] Fix linking of datasets example when ARROW_BUILD_SHARED=OFF
  • ARROW-7054 - [Docs] Enable overriding project version with environment variable when building Sphinx docs
  • ARROW-7057 - [C++] Add API to parse URI query strings
  • ARROW-7058 - [C++] FileSystemDataSourceDiscovery should apply partition schemes relative to its base dir
  • ARROW-7060 - [R] Post-0.15.1 cleanup
  • ARROW-7061 - [C++][Dataset] Add ignore file options to FileSystemDataSourceDiscovery
  • ARROW-7062 - [C++][Dataset] Ensure ParquetFileFormat::Open catch parqu…
  • ARROW-7064 - [R] Support null type using vctrs::unspecified()
  • ARROW-7066 - [Python] Allow returning ChunkedArray in __arrow_array__
  • ARROW-7067 - [CI] Disable code coverage on Travis-CI
  • ARROW-7069 - [C++][Dataset] Replace ConstantPartitionScheme with PrefixDictionaryPartitionScheme
  • ARROW-7070 - [Packaging][deb] Update package names for 1.0.0
  • ARROW-7072 - [Java] Support concatenating validity bits efficiently
  • ARROW-7082 - [Packaging][deb] Add apache-arrow-archive-keyring package
  • ARROW-7086 - [C++] Provide a wrapper for invoking factories to produce a Result
  • ARROW-7092 - [R] Add vignette for dplyr and datasets
  • ARROW-7093 - [R] Support creating ScalarExpressions for more data types
  • ARROW-7094 - [C++] FileSystemDataSource should use an owning pointer for fs::Filesystem
  • ARROW-7095 - [R] Require an explicit call to pull Datasets into memory
  • ARROW-7096 - [C++] Unified ConcatenateTables APIs
  • ARROW-7098 - [Java] Improve the performance of comparing two memory blocks
  • ARROW-7099 - [C++] Disambiguate function calls in csv parser test
  • ARROW-7101 - [CI] Refactor docker-compose setup and use it with GitHub Actions
  • ARROW-7103 - [R] Various minor cleanups
  • ARROW-7107 - [C++][MinGW] Enable Flight on AppVeyor
  • ARROW-7110 - [GLib] Add filter support for GArrowTable, GArrowChunkedArray, and GArrowRecordBatch
  • ARROW-7111 - [GLib] Add take support for GArrowTable, GArrowChunkedArray, and GArrowRecordBatch
  • ARROW-7113 - [Rust] Add unowned buffer.
  • ARROW-7116 - [CI] Use the docker repository provided by apache organization
  • ARROW-7120 - [C++][CI] Add .ccache to the docker-compose volume mounts
  • ARROW-7146 - [R][CI] Various fixes and speedups for the R docker-compose setup
  • ARROW-7147 - [C++][Dataset] Refactor dataset's API to use Result<T>
  • ARROW-7148 - [C++][Dataset] Major API cleanup
  • ARROW-7149 - [C++] Remove experimental status on filesystem APIs
  • ARROW-7155 - [Java][CI] add maven wrapper to make setup process simple
  • ARROW-7159 - [CI] Run HDFS tests as cron task
  • ARROW-7160 - [C++] Update string_view backport
  • ARROW-7161 - [C++] Migrate filesystem APIs from Status to Result
  • ARROW-7162 - [C++] Cleanup warnings in cmake_modules/SetupCxxFlags.cmake
  • ARROW-7166 - [Java] Remove redundant code for Jdbc adapters
  • ARROW-7169 - [C++] Vendor uriparser library
  • ARROW-7171 - [Ruby] Pass Array<Boolean> for Arrow::Table#filter
  • ARROW-7172 - [C++][Dataset] Improve format of Expression::ToString
  • ARROW-7176 - [C++] Fix arrow::ipc compiler warning
  • ARROW-7178 - [C++] Vendor forward compatible std::optional
  • ARROW-7185 - [R][Dataset] Add bindings for IN, IS_VALID expressions
  • ARROW-7186 - [R] Add inline comments to document the dplyr code
  • ARROW-7192 - [Rust] Implement Flight crate
  • ARROW-7193 - [Rust] Arrow stream reader
  • ARROW-7195 - [Ruby] Improve #filter, #take, and #is_in
  • ARROW-7196 - [Ruby] Remove needless BinaryArrayBuilder#append_values
  • ARROW-7197 - [Ruby] Suppress keyword argument related warnings with Ruby 2.7
  • ARROW-7204 - [C++][Dataset] Implicit cast support for InExpression
  • ARROW-7206 - [Java] Avoid string concatenation when calling Preconditions#checkArgument
  • ARROW-7207 - [Rust] Update generated fbs files
  • ARROW-7210 - [C++][R] Allow Numeric <-> Temporal Scalar casts
  • ARROW-7211 - [Rust] Support byte buffers as a parquet sink
  • ARROW-7216 - [Java] Improve the performance of setting/clearing individual bits
  • ARROW-7219 - [Python][CI] Test with pickle5 installed
  • ARROW-7227 - [Python] Added a python wrapper for ConcatenateTablesWithPromotions
  • ARROW-7228 - [Python] Added a python wrapper for RecordBatch.FromStructArray()
  • ARROW-7235 - [C++] Add Result<T> APIs to IO layer
  • ARROW-7236 - [C++] Add Result<T> APIs to arrow/csv
  • ARROW-7240 - [C++] Add Result<T> APIs to arrow/util
  • ARROW-7246 - [CI][Python] Use Python 3 for docker-compose
  • ARROW-7247 - [CI][Python] Fix wheel build error on macOS
  • ARROW-7248 - [Rust] Automatically Generate IPC Messages
  • ARROW-7255 - [CI] Re-enable source release test on pull request
  • ARROW-7257 - [CI] Fix Homebrew formula audit error by openssl
  • ARROW-7258 - [CI] Fix fuzzit build directory
  • ARROW-7259 - [Java] Support subfield encoder use different hasher
  • ARROW-7260 - [CI] Remove Ubuntu 14.04 test job
  • ARROW-7261 - [Python] Add Python support for Fixed Size List type
  • ARROW-7262 - [C++][Gandiva] Added replace function
  • ARROW-7263 - [C++][Gandiva] Implemented locate function
  • ARROW-7268 - [Rust] Add custom_metadata field from IPC message to Schema.
  • ARROW-7269 - [Python] Add ORC to api documentation
  • ARROW-7270 - [Go] preserve CSV reading behaviour, improve memory usage
  • ARROW-7274 - [C++] Add Result<T> APIs to Decimal class
  • ARROW-7275 - [Ruby] Add support for Arrow::ListDataType.new(data_type)
  • ARROW-7276 - [Ruby][...]
  • ARROW-7277 - [Java][Doc] Add discussion about vector lifecycle
  • ARROW-7279 - [C++] Rename UnionArray::type_ids to type_codes
  • ARROW-7284 - [Java] ensure java implementation meets clarified dictionary spec
  • ARROW-7289 - [C#] ListType constructor argument is redundant
  • ARROW-7290 - [C#] Implement ListArray Builder
  • ARROW-7292 - [CI][C++] Add ASAN / UBSAN run
  • ARROW-7293 - [Dev][C++] Persist ccache in docker-compose build volumes
  • ARROW-7296 - [Python] Add ORC api documentation
  • ARROW-7299 - [GLib] Use Result instead of Status
  • ARROW-7303 - [C++] Refactor CSV benchmarks to use Result APIs
  • ARROW-7306 - [C++] Add Result-returning version of FileSystemFromUri
  • ARROW-7307 - [CI][GLib] Ensure generating documentation
  • ARROW-7309 - [Python] Support HDFS federation viewfs
  • ARROW-7310 - [Python] Expose HDFS implementation for pyarrow.fs
  • ARROW-7311 - [Python] Return filesystem and path from URI
  • ARROW-7312 - [Rust] Implement std::error::Error for ArrowError.
  • ARROW-7317 - [C++] Migrate Iterator to a Result API
  • ARROW-7319 - [C++] Refactor Iterator<T> to yield Result<T>
  • ARROW-7321 - [CI][GLib] Disable development mode
  • ARROW-7322 - [CI][Python] Fall back to arrowdev dockerhub organization for manylinux images
  • ARROW-7323 - [CI][Rust] Use the same toolchain
  • ARROW-7324 - [Rust] Add timezone to timestamp
  • ARROW-7325 - [Rust][Parquet] Update to parquet-format 2.6 and thrift 0.12
  • ARROW-7329 - [Java] AllocationManager: Allow managing different types …
  • ARROW-7333 - [CI][Rust] Remove duplicated nightly job
  • ARROW-7334 - [CI][Python] Use Python 3 on macOS
  • ARROW-7339 - [CMake] Thrift version not respected in CMake configuration version.txt
  • ARROW-7340 - [CI] Prune defunct appveyor build setup
  • ARROW-7344 - [Packaging][Python] Build manylinux2014 wheels
  • ARROW-7346 - [CI] Explicit usage of ccache across the builds
  • ARROW-7347 - [C++] Update bundled Boost to 1.71.0
  • ARROW-7348 - [Rust] Add api to return null bitmap buffer.
  • ARROW-7351 - [Developer] Only suggest cpp-* versions by default for PARQUET issues in merge tool
  • ARROW-7357 - [Go] migrate to x/xerrors
  • ARROW-7366 - [C++][Dataset] Use PartitionSchemeDiscovery in DataSourceDiscovery
  • ARROW-7367 - [Python] Use np.full instead of np.array.repeat in ParquetDatasetPiece
  • ARROW-7368 - [Ruby] Use :arrow_file and :arrow_streaming for format name
  • ARROW-7369 - [GLib] Add garrow_table_combine_chunks
  • ARROW-7370 - [C++] Fix old Protobuf with AUTO detection failure
  • ARROW-7377 - [C++][Dataset] Add ScanOptions::MaterializedFields
  • ARROW-7378 - [C++][Gandiva] Fix loop vectorization in gandiva
  • ARROW-7379 - [C++] Introduce SchemaBuilder companion class and Field::IsCompatibleWith
  • ARROW-7380 - [C++][Dataset] Implement DatasetFactory
  • ARROW-7382 - [C++][Dataset] Insert missing directories in FileSystemDataSourceDiscovery::Make
  • ARROW-7387 - [C#] Support ListType Serialization
  • ARROW-7392 - [Packaging] Add conda packaging tasks for python 3.8
  • ARROW-7398 - [Packaging][Python] Conda builds are failing on macOS
  • ARROW-7399 - [C++][Gandiva] set Mcpu based on host cpu
  • ARROW-7402 - [C++] Add more information on CUDA error
  • ARROW-7403 - [C++][JSON] Enable Rapidjson on Arm64 Neon
  • ARROW-7410 - [Doc][Python] Document filesystem API
  • ARROW-7411 - [C++][Flight] Improve the output of Arrow Flight benchmark
  • ARROW-7413 - [Python] Expose and test the partitioning discovery
  • ARROW-7414 - [R][Dataset] Implement *PartitionSchemeDiscovery in R
  • ARROW-7415 - [C++][Dataset] implement IpcFormat
  • ARROW-7416 - [R][Nightly] Fix macos-r-autobrew build on R 3.6.2
  • ARROW-7417 - [C++] Add a docker-compose entry for CUDA 10.1
  • ARROW-7418 - [C++] Fix build error on Ubuntu 16.04
  • ARROW-7420 - [C++] Migrate tensor related APIs to Result-returning version
  • ARROW-7429 - [Java] Enhance code style checking for Java code (remove consecutive spaces)
  • ARROW-7430 - [Python] Add more docstrings to dataset bindings
  • ARROW-7431 - [Python] Add dataset API to reference docs
  • ARROW-7432 - [Python] Add higher level open_dataset function
  • ARROW-7439 - [C++][Dataset] Remove pointer aliases
  • ARROW-7449 - [GLib] Make GObject Introspection optional
  • ARROW-7452 - [GLib] Make GArrowTimeDataType abstract
  • ARROW-7453 - [Ruby]
  • ARROW-7454 - [Ruby] Add support for saving/loading TSV
  • ARROW-7455 - [Ruby] Use Arrow::DataType.resolve for all GArrowDataType input
  • ARROW-7456 - [C++] Add support for YYYY-MM-DDThh and YYYY-MM-DDThh:mm timestamp formats
  • ARROW-7457 - [Doc] fix typos
  • ARROW-7459 - [Python] Fix document lint error
  • ARROW-7460 - [Rust] Improve some kernel performance
  • ARROW-7461 - [Java] fix typos
  • ARROW-7463 - [Doc] fix a broken link and typo
  • ARROW-7464 - [C++] Refine CpuInfo singleton with std::call_once
  • ARROW-7465 - [C++] Add Arrow memory benchmark for Arm64
  • ARROW-7468 - [Python] fix typos
  • ARROW-7469 - [C++] Improve division related bit operations
  • ARROW-7470 - [JS] fix typos
  • ARROW-7474 - [Ruby] Improve CSV save performance
  • ARROW-7475 - [Rust] Arrow IPC Stream writer
  • ARROW-7477 - [Java][FlightRPC] set up gRPC reflection metadata
  • ARROW-7479 - [Rust][Ruby][R] Fix typos
  • ARROW-7481 - [C#] fix typo
  • ARROW-7482 - [C++] Fix typos
  • ARROW-7484 - [C++][Gandiva] Fix typos
  • ARROW-7485 - [C++][Plasma] Fix typos
  • ARROW-7487 - [Developer] Fix typos
  • ARROW-7488 - [GLib] Fix typos and broken links
  • ARROW-7489 - [CI] Fix typos
  • ARROW-7490 - [Java] Avro converter should convert attributes and props to FieldType metadata
  • ARROW-7493 - [Python] Expose sum kernel in pyarrow.compute and support ChunkedArray inputs
  • ARROW-7498 - [Dataset] Rename core classes before stable API
  • ARROW-7502 - [Integration] Remove Spark patch not needed
  • ARROW-7513 - [JS][tutorial] - Rich cols part 1
  • ARROW-7514 - [C#] Make GetValueOffset Obsolete
  • ARROW-7519 - [Python] Build wheels, conda packages with dataset support
  • ARROW-7521 - [Rust] Remove tuple on FixedSizeList
  • ARROW-7523 - [Developer] Relax clang-tidy check
  • ARROW-7526 - [C++][Compute] Optimize small integer sorting
  • ARROW-7532 - [CI] Unskip brew test after Homebrew fixes it upstream
  • ARROW-7537 - [CI][R] Nightly macOS autobrew job should be more verbose if it fails
  • ARROW-7538 - [Java] Clarify actual and desired size in AllocationManager
  • ARROW-7540 - [C++] Install license files and README
  • ARROW-7541 - [GLib] Install license files
  • ARROW-7542 - [CI][C++] Use $(sysctl -n hw.ncpu) instead of $(nproc) on macOS
  • ARROW-7549 - [Java] Reorganize Flight modules to keep top level clean/organized
  • ARROW-7550 - [R][CI] Run donttest examples in CI
  • ARROW-7557 - [C++][Compute] Validate sorting stability
  • ARROW-7558 - [Packaging][deb][RPM] Use the host owner and group for artifacts
  • ARROW-7560 - [Rust] Reduce Rc/Refcell usage
  • ARROW-7565 - [Website] Add support for download URL redirect
  • ARROW-7566 - [CI] Use more recent Miniconda on AppVeyor
  • ARROW-7567 - [Java] Fix races in Checkstyle upgrade
  • ARROW-7567 - [Java] Bump Checkstyle from 6.19 to 8.19
  • ARROW-7568 - [Java] Bump Apache Avro from 1.9.0 to 1.9.1
  • ARROW-7569 - [Python] Add API to map Arrow types to pandas ExtensionDtypes in to_pandas conversions
  • ARROW-7570 - [Java] Fix high severity issues
  • ARROW-7571 - [Java] Correct minimal Java version on README
  • ARROW-7572 - [Java] Enforce Maven 3.3+ as mentioned in README
  • ARROW-7573 - [Rust] Reduce boxing and cleanup
  • ARROW-7575 - [R] Linux binary packaging followup
  • ARROW-7576 - [C++][Dev] Improve fuzzing setup
  • ARROW-7577 - [CI][C++] Check OSS-Fuzz build in Github Actions
  • ARROW-7578 - [R] Add support for datasets with IPC files and with multiple sources
  • ARROW-7580 - [Website] 0.16 release post
  • ARROW-7581 - [R] Documentation/polishing for 0.16 release
  • ARROW-7590 - [C++] Don't ignore managed files in thirdparty
  • ARROW-7597 - [C++] More compact CMake configuration summary
  • ARROW-7600 - [C++][Parquet] failing disabled unittest for nested parquet.
  • ARROW-7601 - [Doc][C++] Update fuzzing doc
  • ARROW-7602 - [Archery] Add more archery build options
  • ARROW-7613 - [Rust] Remove redundant :: prefixes
  • ARROW-7622 - [Format] Mark Tensor and SparseTensor fields required
  • ARROW-7623 - [C++] Update generated flatbuffers code
  • ARROW-7626 - [Parquet][GLib] Add support for version macros
  • ARROW-7627 - [C++][Gandiva] Optimize string truncate function
  • ARROW-7629 - [C++][CI] Add fuzz regression files to arrow-testing
  • ARROW-7630 - [C++][CI] Check fuzz crash regressions in CI
  • ARROW-7632 - [C++][CI] Add extension type data to IPC fuzz seed corpus
  • ARROW-7635 - [C++] Add pkg-config support for each components
  • ARROW-7636 - [Python] Clean-up the pyarrow.dataset.partitioning() API
  • ARROW-7644 - Add vcpkg installation instructions
  • ARROW-7645 - [Packaging][deb][RPM] Fix arm64 packaging build
  • ARROW-7648 - [C++] Sanitize local paths on Windows
  • ARROW-7658 - [R] Support dplyr filtering on date/time
  • ARROW-7659 - [Rust] Reduce Rc usage
  • ARROW-7660 - [C++][Gandiva] Optimise castVarchar(string, int) function for single byte characters
  • ARROW-7665 - [R] Build in parallel in linuxLibs.R
  • ARROW-7666 - [Packaging][deb] Always use Ninja to reduce build time
  • ARROW-7667 - [Packaging][deb] Add ubuntu-eoan to nightly jobs
  • ARROW-7668 - [Packaging][RPM] Use Ninja if possible to reduce build time
  • ARROW-7670 - [Python][Dataset] More ergonomic API
  • ARROW-7671 - [Python][Dataset] Add bindings for the DatasetFactory
  • ARROW-7674 - [Dev] Add helpful message for captcha challenge in merge_arrow_pr.py
  • ARROW-7682 - [Packaging] Add support for arm64 APT/Yum repositories
  • ARROW-7683 - [Packaging] Set 0.16.0 as the next version
  • ARROW-7686 - [Packaging][deb][RPM] Include more arrow-*.pc
  • ARROW-7687 - [C++] Fix dead links in README
  • ARROW-7692 - [Rust] Simplify some Option / Result pattern matches
  • ARROW-7694 - [Packaging][deb][RPM] Add support for RC to repository packages
  • ARROW-7695 - [Release] Update java versions to 0.16-SNAPSHOT
  • ARROW-7696 - [Release] Add support for running unit test on release branch
  • ARROW-7697 - [Release] Add a test for updating Linux packages by 00-prepare.sh
  • ARROW-7710 - [Release][C#] Add support for redirecting .NET download URL
  • ARROW-7711 - [C#] Make Date32 test independent of system timezone
  • ARROW-7715 - [Release][APT] Ignore some arm64 verifications
  • ARROW-7716 - [Packaging][APT] Use the "main" component for Ubuntu 19.10
  • ARROW-7719 - [Python][Dataset] Table equality check occasionally fails
  • ARROW-7724 - [Release][Yum] Ignore some arm64 verifications
  • ARROW-7743 - [Rust] [Parquet] Support reading timestamp micros
  • ARROW-7768 - [Rust] Implement Length and TryClone traits for Cursor<Vec<u8>> in reader.rs
  • ARROW-8015 - [Python] Build 0.16.0 wheel install for Windows + Python 3.5 and publish to PyPI
  • PARQUET-517 - [C++] Use arrow::MemoryPool for all heap allocations
  • PARQUET-1300 - [C++] Implement encrypted Parquet read and write support
  • PARQUET-1664 - [C++] Provide API to return metadata string from FileMetadata.
  • PARQUET-1678 - [C++] Provide classes for reading/writing using input/output operators
  • PARQUET-1688 - [C++] StreamWriter/StreamReader can't be built with g++ 4.8.5 on CentOS 7
  • PARQUET-1689 - [C++] Stream API: Allow for columns/rows to be skipped when reading
  • PARQUET-1701 - [C++] Stream API: Add support for optional fields
  • PARQUET-1704 - [C++] Add re-usable encryption buffer to SerializedPageWriter
  • PARQUET-1705 - [C++] Disable shrink-to-fit on the re-usable decryption buffer
  • PARQUET-1712 - [C++] Stop using deprecated APIs in examples
  • PARQUET-1721 - [C++][Parquet] Add missing arrow dependency to parquet.pc
  • PARQUET-1734 - [C++] Fix typo
  • PARQUET-1769 - [C++] Update parquet.thrift to parquet-format 2.8.0
kszucs
published 0.15.1 •

Apache Arrow 0.15.1 (2019-11-01)

Bug Fixes

  • ARROW-6464 - [Java] Refactor FixedSizeListVector#splitAndTransfer with slice API (#5293)
  • ARROW-6728 - [C#] Support reading and writing Date32 and Date64 arrays
  • ARROW-6740 - [C++] Unmap MemoryMappedFile as soon as possible
  • ARROW-6762 - [C++] Support reading JSON files with no newline at end
  • ARROW-6795 - [C#] Fix for reading large (2GB+) files
  • ARROW-6806 - [C++][Python] Fix crash validating an IPC-originating empty array
  • ARROW-6809 - [Ruby] Gem does not install on macOS due to glib2 3.3.7 compilation failure
  • ARROW-6813 - [Ruby] Arrow::Table.load with headers=true leads to exception in Arrow 0.15
  • ARROW-6834 - [C++][TRIAGE] Pin gtest version 1.8.1 to unblock Appveyor builds
  • ARROW-6844 - [C++][Parquet] Fix regression in reading List types with item name that is not "item"
  • ARROW-6857 - [C++] Fix DictionaryEncode for zero-chunk ChunkedArray
  • ARROW-6860 - [Python][C++] Do not link shared libraries monolithically to pyarrow.lib, add libarrow_python_flight.so
  • ARROW-6861 - [C++] Fix length/null_count/capacity accounting through Reset and AppendIndices in DictionaryBuilder
  • ARROW-6869 - [C++] Do not return invalid arrays from DictionaryBuilder::Finish when reusing builder. Add "FinishDelta" method and "ResetFull" method
  • ARROW-6873 - [Python] Remove stale CColumn references
  • ARROW-6874 - [Python] Fix memory leak when converting to Pandas object data
  • ARROW-6876 - [C++][Parquet] Use shared_ptr to avoid copying ReaderContext struct, fix performance regression with reading many columns
  • ARROW-6877 - [C++] Add additional Boost versions to support 1.71 and the presumed next 2 future versions
  • ARROW-6878 - [Python] Fix creating array from list of dicts with bytes keys
  • ARROW-6882 - [C++] Ensure the DictionaryArray indices has no dictionary data
  • ARROW-6886 - [C++] Fix arrow::io nvcc compiler warnings
  • ARROW-6898 - [Java] Fix potential memory leak in ArrowWriter and several test classes
  • ARROW-6903 - [Python] Attempt to fix Python wheels with introduction of libarrow_python_flight, disabling of pyarrow.orc
  • ARROW-6905 - [Gandiva][Crossbow] Use xcode9.4 for osx builds, do not build dataset, filesystem
  • ARROW-6910 - [C++][Python] Configure jemalloc to release dirty pages more aggressively back to the OS: set dirty_decay_ms and muzzy_decay_ms to 0 by default, and add a C++ / Python option to configure this
  • ARROW-6922 - [Python] Compat with pandas for MultiIndex.levels.names
  • ARROW-6937 - [Packaging][Python] Fix conda linux and OSX wheel nightly builds
  • ARROW-6938 - [Packaging][Python] Disable bz2 in Windows wheels and build ZSTD in bundled mode to triage linking issues
  • ARROW-6962 - [C++][CI] Stop compiling with -Weverything
  • ARROW-6977 - [C++] Disable jemalloc background_thread on macOS
  • ARROW-6983 - [C++] Fix ThreadedTaskGroup lifetime issue
  • ARROW-7422 - [Python] Improper CPU flags failing pyarrow install in ARM devices
  • ARROW-7423 - Pyarrow ARM install fails from source with no clear error
  • ARROW-9349 - [Python] parquet.read_table causes crashes on Windows Server 2016 w/ Xeon Processor

New Features and Improvements

  • ARROW-6610 - [C++] Add cmake option to disable filesystem layer
  • ARROW-6661 - [Java] Implement APIs like slice to enhance VectorSchemaRoot (#5470)
  • ARROW-6777 - [GLib][CI] Unpin gobject-introspection gem
  • ARROW-6852 - [C++] Fix build issue on memory-benchmark
  • ARROW-6927 - [C++] Add gRPC version check
  • ARROW-6963 - [Packaging][Wheel][OSX] Use crossbow's command to deploy artifacts from travis builds
kszucs
published 0.15.0 •

Apache Arrow 0.15.0 (2019-10-05)

New Features and Improvements

  • ARROW-453 - [C++] Filesystem implementation for Amazon S3
  • ARROW-517 - [C++] array comparison, uses D**2 space Myers
  • ARROW-750 - [Format][C++] Add LargeBinary and LargeString types
  • ARROW-1324 - [C++] Add support for bundled Boost with MSVC
  • ARROW-1561 - [C++] Kernel implementations for IsIn
  • ARROW-1566 - [C++] Implement non-materializing sort kernels
  • ARROW-1741 - [C++] Add DictionaryArray::CanCompareIndices
  • ARROW-1786 - [Format] List expected on-wire buffer layouts for each kind of Arrow physical type in specification
  • ARROW-1789 - [Format] Consolidate specification documents and improve clarity for new implementation authors
  • ARROW-1875 - [Java] Write 64-bit ints as strings in integration test JSON files
  • ARROW-2006 - [C++] Add option to trim excess padding when writing IPC messages
  • ARROW-2431 - [Rust] Schema fidelity
  • ARROW-2769 - [Python] Deprecate and rename add_metadata methods
  • ARROW-2931 - [Crossbow] Windows builds are attempting to run linux and osx packaging tasks
  • ARROW-3032 - [C++] Clean up Numpy-related headers
  • ARROW-3204 - [R] Enable R package to be made available on CRAN
  • ARROW-3243 - [C++] Upgrade jemalloc to version 5
  • ARROW-3246 - [C++][Python][Parquet] Direct writing of DictionaryArray to Parquet columns, automatic decoding to Arrow
  • ARROW-3325 - [Python][FOLLOWUP] In Python 2.7, a class's __doc__ member is not writable (#5018)
  • ARROW-3325 - [Python][Parquet] Add "read_dictionary" argument to parquet.read_table, ParquetDataset to enable direct-to-DictionaryArray reads
  • ARROW-3531 - [Python] add Schema.field() method / deprecate field_by_name
  • ARROW-3538 - [Python] ability to override the automated assignment of uuid for filenames when writing datasets
  • ARROW-3579 - [Crossbow] Unintuitive error message when remote branch has not been pushed
  • ARROW-3643 - [Rust] optimize BooleanBufferBuilder::append_slice
  • ARROW-3710 - [Crossbow][Python] Run nightly tests against pandas master
  • ARROW-3772 - [C++][Parquet] Write Parquet dictionary indices directly to DictionaryBuilder rather than routing through dense form
  • ARROW-3777 - [C++] Add Slow input streams and slow filesystem
  • ARROW-3817 - [R] Extract methods for RecordBatch and Table
  • ARROW-3829 - [Python] add __arrow_array__ protocol to support third-party array classes in conversion to Arrow
  • ARROW-3943 - [R] Write vignette for R package
  • ARROW-4036 - [C++] Pluggable Status message, by exposing an abstract delegate class.
  • ARROW-4095 - [C++] Optimize DictionaryArray::Transpose() for trivial transpositions
  • ARROW-4111 - [Python] Create time types from Python sequences of integers
  • ARROW-4218 - [Rust][Parquet] Initial support for array reader.
  • ARROW-4220 - [Python] Add buffered IO benchmarks with simulated high latency, allow duck-typed files in input_stream/output_stream
  • ARROW-4365 - [Rust][Parquet] Implement arrow record reader.
  • ARROW-4398 - [C++][Python][Parquet] Improve BYTE_ARRAY PLAIN encoding write performance. Add BYTE_ARRAY write benchmarks
  • ARROW-4473 - [Website] Add instructions to do a test-deploy of Arrow website and fix bugs
  • ARROW-4507 - [Format] Create outline and introduction for new document.
  • ARROW-4508 - [Format] Copy content from Layout.rst to new document.
  • ARROW-4509 - [Format] Copy content from Metadata.rst to new document.
  • ARROW-4510 - [Format] copy content from IPC.rst to new document.
  • ARROW-4511 - [Format][Docs] Revamp Format documentation, consolidate columnar format docs into a more coherent single document. Add Versioning/Stability page
  • ARROW-4648 - [Doc] Add documentation about C++ file naming
  • ARROW-4648 - [C++] Use underscores in source file names
  • ARROW-4649 - [C++/CI/R] Add nightly job that tests the homebrew formula
  • ARROW-4752 - [Rust] Add explicit SIMD vectorization for the divide kernel
  • ARROW-4810 - [Format][C++] Add LargeList type
  • ARROW-4841 - [C++] Add arrowOptions.cmake with options used to build arrow
  • ARROW-4860 - [C++] Build AWS C++ SDK for Windows in conda-forge
  • ARROW-5134 - [R][CI] Run nightly tests against multiple R versions
  • ARROW-5211 - [Format] Missing documentation under `Dictionary encoding` section on MetaData page
  • ARROW-5216 - [CI] Add Appveyor badge to README
  • ARROW-5307 - [CI][GLib] Enable GTK-Doc
  • ARROW-5337 - [C++] Add RecordBatch::field method, possibly deprecate "column"
  • ARROW-5343 - [C++] Refactor dictionary unification to incremental interface, and use Buffer for transpose map allocations
  • ARROW-5344 - [C++] Use ArrayDataVisitor in dict-to-anything cast
  • ARROW-5351 - [Rust] Take kernel
  • ARROW-5358 - [Rust] Implement equality check for ArrayData and Array
  • ARROW-5380 - [C++] Fix memory alignment UBSan errors.
  • ARROW-5439 - [Java] Utilize stream EOS in File format
  • ARROW-5444 - [Release][Website] After 0.14 release, update what is an "official" release
  • ARROW-5458 - [C++] Apache Arrow parallel CRC32c computation optimization
  • ARROW-5480 - [Python] Add unit test asserting specifically that pandas.Categorical roundtrips to Parquet format without special options
  • ARROW-5483 - [Java] add ValueVector constructors that take Field object
  • ARROW-5494 - [Python] Create FileSystem bindings
  • ARROW-5505 - [R] Normalize file and class names, stop masking base R functions, add vignette, improve documentation
  • ARROW-5527 - [C++] Use Buffer/Builder in HashTable and MemoTable
  • ARROW-5558 - [C++] Support Array::View on arrays with non-zero offset
  • ARROW-5559 - [C++] Add an IpcOptions structure
  • ARROW-5564 - [C++] Use uriparser from conda-forge
  • ARROW-5579 - [Java] Shade flatbuffers
  • ARROW-5580 - [C++][Gandiva] Correct definitions of timestamp functions in Gandiva
  • ARROW-5588 - [C++] Better support for building union arrays
  • ARROW-5594 - [C++] add UnionArrays support to Take/Filter kernels
  • ARROW-5610 - [Python] define extension types in Python
  • ARROW-5646 - [Crossbow][Documentation] Move the user guide to the Sphinx documentation
  • ARROW-5681 - [FlightRPC] Add Flight-specific error APIs
  • ARROW-5686 - [R] Review R Windows CI build
  • ARROW-5716 - [Developer] Improve merge PR script to attribute multiple authors
  • ARROW-5717 - [Python] Unify variable dictionaries when converting to pandas
  • ARROW-5719 - [Java] Support in-place vector sorting
  • ARROW-5722 - [Rust] Implement Debug for List/Struct/BinaryArray
  • ARROW-5734 - [Python] Dispatch to Table.from_arrays from pyarrow.table factory function
  • ARROW-5736 - [Format][C++] Support small bit-width indices in sparse tensor
  • ARROW-5741 - [JS] Make numeric vector from functions consistent with TypedArray.from
  • ARROW-5743 - [C++] Add cmake option and macros for enabling large memory tests
  • ARROW-5746 - [Website] Move website source out of apache/arrow
  • ARROW-5747 - [C++] Improve CSV header and column names options
  • ARROW-5758 - [C++][Gandiva][Java] Support casting decimals to varchar and vice versa
  • ARROW-5762 - [JS] Align Map type impl with the spec
  • ARROW-5777 - [C++] Add microbenchmark for some Decimal128 operations
  • ARROW-5778 - [Java] Extract the logic for vector data copying to the super classes
  • ARROW-5784 - [Release][GLib] Replace c_glib/ after running c_glib/autogen.sh in dev/release/02-source.sh
  • ARROW-5786 - [Release] Use arrow-jni profile to run "mvn release:perform"
  • ARROW-5788 - [Rust] Use both "path" and "version" for internal dependencies
  • ARROW-5789 - [C++] Minor fixes for warnings, remove unused ubsan.cc
  • ARROW-5792 - [Rust] Add TypeVisitor for parquet type.
  • ARROW-5798 - [Packaging][deb] Update doc architecture
  • ARROW-5800 - [R] Dockerize R Travis CI tests so they can be run anywhere via docker-compose
  • ARROW-5803 - [CI] Dockerize C++ with clang 7 Travis CI
  • ARROW-5812 - [Java] Refactor method name and param type in BaseIntVector
  • ARROW-5813 - [C++] Fix TensorEquals for different contiguous tensors
  • ARROW-5814 - [Java] Implement a <Object, int> HashMap for DictionaryEncoder
  • ARROW-5827 - [C++] Require c-ares CMake config
  • ARROW-5828 - [C++] Add required Protocol Buffers versions check
  • ARROW-5830 - [C++] Stop using memcmp in TensorEquals for tensors with float values
  • ARROW-5832 - [Java] Support search operations for vector data
  • ARROW-5833 - [C++] Factor out Status-enriching code
  • ARROW-5834 - [Java] Apply new hash map in DictionaryEncoder
  • ARROW-5835 - [Java] Support Dictionary Encoding for binary type
  • ARROW-5841 - [Website] Add 0.14.0 release note
  • ARROW-5842 - [Java] Revise the semantic of lastSet in ListVector
  • ARROW-5843 - [Java] Improve the readability and performance of BitVectorHelper#getNullCount
  • ARROW-5844 - [Java] Support comparison & sort for more numeric types
  • ARROW-5846 - [Java] Create Avro adapter module and add dependencies
  • ARROW-5853 - [Python] Expose boolean filter kernel on Array
  • ARROW-5861 - [Java] Initial implementation to convert Avro record with primitive types
  • ARROW-5862 - [Java] Provide dictionary builder
  • ARROW-5864 - [Python] Simplify Result class cython wrapper
  • ARROW-5865 - [Release] Helper script to rebase PRs on master
  • ARROW-5866 - [C++] Remove duplicate library in cpp/Brewfile
  • ARROW-5867 - [C++][Gandiva] add support for cast int to decimal
  • ARROW-5872 - [C++][Gandiva] Support mod(double, double) function in Gandiva
  • ARROW-5876 - [C++][Python] add basic auth flight proto message to C++ and Python
  • ARROW-5877 - [FlightRPC] Fix Python<->Java auth issues
  • ARROW-5880 - [C++][Parquet] Use TypedBufferBuilder instead of ArrayBuilder in writer.cc
  • ARROW-5881 - [Java] Provide functionalities to efficiently determine if a validity buffer has completely 1 bits/0 bits
  • ARROW-5883 - [Java] Support dictionary encoding for List and Struct type
  • ARROW-5888 - [C++][Parquet][Python] Restore timezone metadata when original Arrow schema has been stored in Parquet metadata
  • ARROW-5891 - [C++][Gandiva] Remove duplicates in function registry
  • ARROW-5892 - [C++][Gandiva] Support function aliases
  • ARROW-5893 - [C++][Python][GLib][Ruby][MATLAB][R] Remove arrow::Column class
  • ARROW-5897 - [Java] Remove duplicated logic in MapVector
  • ARROW-5898 - [Java] Provide functionality to efficiently compute hash code for arbitrary memory segment
  • ARROW-5900 - [Java] Bounds check for decimal args.
  • ARROW-5901 - [Rust] Add equals to json arrays.
  • ARROW-5902 - [Java] Implement hash table and equals & hashCode API for dictionary encoding
  • ARROW-5903 - [Java] Optimise set methods in decimal vector
  • ARROW-5904 - [Java][Plasma] Fix compilation of Plasma Java client
  • ARROW-5906 - [CI] Turn off ARROW_VERBOSE_THIRDPARTY_BUILD by default in Docker builds
  • ARROW-5908 - [C#] ArrowStreamWriter doesn't align buffers to 8 bytes
  • ARROW-5909 - [Java] Optimize ByteFunctionHelpers equals & compare logic
  • ARROW-5911 - [Java] Make ListVector and MapVector create reader lazily
  • ARROW-5917 - [Java] Redesign the dictionary encoder
  • ARROW-5918 - [Java] Add get to BaseIntVector interface
  • ARROW-5919 - [R] Test R-in-conda as a nightly build
  • ARROW-5920 - [Java] Support sort & compare for all variable width vectors
  • ARROW-5924 - [Plasma] return a replica of GpuProcessHandle::ptr when create or get an object
  • ARROW-5934 - [Python] Bundle arrow's LICENSE with the wheels
  • ARROW-5937 - [Release] Stop parallel binary upload
  • ARROW-5938 - [Release] Create branch for adding release note automatically
  • ARROW-5939 - [Release] Add support for generating vote email template separately
  • ARROW-5940 - [Release] Add support for re-uploading sign/checksum for binary artifacts
  • ARROW-5941 - [Release] Avoid re-uploading already uploaded binary artifacts
  • ARROW-5943 - [GLib][Gandiva] Add support for function aliases
  • ARROW-5944 - [C++][Gandiva] Remove 'div' alias for 'divide'
  • ARROW-5945 - [Rust][DataFusion] Table trait can now be used to build real queries
  • ARROW-5947 - [Rust][DataFusion] Remove serde crate dependency
  • ARROW-5948 - [Rust] [DataFusion] create_logical_plan should not call optimizer
  • ARROW-5955 - [Plasma] Support setting memory quotas per plasma client for better isolation
  • ARROW-5957 - [C++][Gandiva] Implement div function in Gandiva
  • ARROW-5958 - [Python] Link zlib statically in the wheels
  • ARROW-5961 - [R] Be able to run R-only tests even without C++ library
  • ARROW-5962 - [CI][Python] Remove manylinux1 builds from Travis CI
  • ARROW-5967 - [Java] DateUtility#timeZoneList is not correct
  • ARROW-5970 - [Java] Provide pointer to Arrow buffer
  • ARROW-5974 - [C++] Support reading concatenated compressed streams
  • ARROW-5975 - [C++][Gandiva] support castTIMESTAMP(date)
  • ARROW-5976 - [C++] RETURN_IF_ERROR(ctx) should be namespaced
  • ARROW-5977 - [C++][Python] Allow specifying which columns to include
  • ARROW-5979 - [FlightRPC] Expose opaque (de)serialization of protocol types
  • ARROW-5985 - [Developer] Do not suggest setting Fix Version for patch releases by default
  • ARROW-5986 - [Java] Code cleanup for dictionary encoding
  • ARROW-5988 - [Java] Avro adapter implement simple Record type
  • ARROW-5997 - [Java] Support dictionary encoding for Union type
  • ARROW-5998 - [Java] Open a document to track the API changes
  • ARROW-6000 - [Python] Add support for LargeString and LargeBinary types
  • ARROW-6008 - [Release] Stop parallel binary artifacts upload
  • ARROW-6009 - [JS] Ignore NPM errors in the javascript release script
  • ARROW-6013 - [Java] Support range searcher
  • ARROW-6017 - [FlightRPC] Enable creating Flight Locations for unknown schemes
  • ARROW-6020 - [Java] Refactor ByteFunctionHelper#hash with new added ArrowBufHasher
  • ARROW-6021 - [Java] Extract copyFrom and copyFromSafe methods to ValueVector interface
  • ARROW-6022 - [Java] Support equals API in ValueVector to compare two vectors equal
  • ARROW-6023 - [C++][Gandiva] Add functions in Gandiva
  • ARROW-6024 - [Java] Provide more hash algorithms
  • ARROW-6026 - [Doc] Add CONTRIBUTING.md
  • ARROW-6030 - [Java] Efficiently compute hash code for ArrowBufPointer
  • ARROW-6031 - [Java] Support iterating a vector by ArrowBufPointer
  • ARROW-6034 - [C++][Gandiva] Add string functions in Gandiva
  • ARROW-6035 - [Java] Avro adapter support convert nullable value
  • ARROW-6036 - [GLib] Add support for skip rows and column_names CSV read option
  • ARROW-6037 - [GLib] Add a missing version macro
  • ARROW-6039 - [GLib] Add garrow_array_filter()
  • ARROW-6041 - [Website] Blog post announcing R library availability on CRAN
  • ARROW-6042 - [C++][Parquet] Add Dictionary32Builder that always returns 32-bit dictionary indices
  • ARROW-6045 - [C++] Add benchmark for double and float encoding/decoding, as well as NaN encoding
  • ARROW-6048 - [C++] Add ChunkedArray::View method that dispatches to Array::View
  • ARROW-6049 - [C++] Support view from one dictionary type to another in Array::View
  • ARROW-6053 - [Python] Fix pyarrow's RecordBatchStreamReader::Open2 type signature
  • ARROW-6063 - [FlightRPC] implement half-closed semantics for DoPut
  • ARROW-6065 - [C++][Parquet] Clean up parquet/arrow/reader.cc, reduce code duplication, improve readability
  • ARROW-6069 - [Rust][Parquet] Add converter.
  • ARROW-6070 - [Java] Avoid creating new schema before IPC sending
  • ARROW-6077 - [C++][Parquet] Build Arrow "schema tree" from Parquet schema to help with nested data implementation
  • ARROW-6078 - [Java] Implement dictionary-encoded subfields for List type
  • ARROW-6079 - [Java] Implement/test UnionFixedSizeListWriter for FixedSizeListVector
  • ARROW-6080 - [Java] Support search operation for BaseRepeatedValueVector
  • ARROW-6083 - [Java] Refactor Jdbc adapter consume logic
  • ARROW-6084 - [Python] Support LargeList
  • ARROW-6085 - [Rust][DataFusion] Add traits for physical query plan
  • ARROW-6086 - [Rust][DataFusion] Add support for partitioned Parquet data sources
  • ARROW-6087 - [Rust] [DataFusion] Implement parallel execution for CSV scan
  • ARROW-6088 - [Rust][DataFusion] Projection execution plan
  • ARROW-6089 - [Rust][DataFusion] Implement physical plan for "selection" operator
  • ARROW-6090 - [Rust][DataFusion] Physical plan for HashAggregate
  • ARROW-6093 - [Java] reduce branches in algo for first match in VectorRangeSearcher
  • ARROW-6094 - [FlightRPC] Add Flight RPC method getFlightSchema
  • ARROW-6096 - [C++] conditionally use boost regex for gcc < 4.9
  • ARROW-6097 - [Java] Avro adapter implement unions type
  • ARROW-6100 - [Rust] Pin to specific nightly rust for reproducible/stable builds
  • ARROW-6101 - [Rust][DataFusion] Parallel execution of physical query plan
  • ARROW-6102 - [Testing] Add partitioned CSV file to arrow-testing repo
  • ARROW-6104 - [Rust][DataFusion] Remove use of bare trait objects
  • ARROW-6105 - [C++][Parquet][Python] Add test case showing dictionary-encoded subfields in nested type
  • ARROW-6113 - [Java] Support vector deduplicate function
  • ARROW-6115 - [Python] Support LargeBinary and LargeString in conversion to python
  • ARROW-6118 - [Java] Replace google Preconditions with Arrow Preconditions
  • ARROW-6121 - [Tools] Improve merge tool ergonomics
  • ARROW-6125 - [Python] Remove Python APIs deprecated in 0.14.x and prior
  • ARROW-6127 - [Website] Add favicons and meta tags
  • ARROW-6128 - [C++] Suppress a class-memaccess warning
  • ARROW-6130 - [Release] Use 0.15.0 as the next release
  • ARROW-6134 - [C++][Gandiva] Add concat function in Gandiva
  • ARROW-6137 - [C++][Gandiva] Use snprintf instead of stringstream in castVARCHAR(timestamp)
  • ARROW-6137 - [C++][Gandiva] Change output format of castVARCHAR(timestamp) in Gandiva
  • ARROW-6138 - [C++] Add a basic (single RecordBatch) implementation of Dataset
  • ARROW-6139 - [Documentation][R] Build R docs (pkgdown) site and add to arrow-site
  • ARROW-6141 - [C++] Enable memory-mapping a file region
  • ARROW-6142 - [R] Install instructions on linux could be clearer
  • ARROW-6143 - [Java] Unify the copyFrom and copyFromSafe methods for all vectors
  • ARROW-6144 - [C++][Gandiva] Implement random functions in Gandiva
  • ARROW-6155 - [Java] Extract a super interface for vectors whose elements reside in continuous memory segments
  • ARROW-6156 - [Java] Support compare semantics for ArrowBufPointer
  • ARROW-6161 - [C++][Dataset] Implement ParquetFragment
  • ARROW-6162 - [C++][Gandiva] Do not truncate string in castVARCHAR_utf8 if output length is zero
  • ARROW-6164 - [Docs][Format] Document project versioning schema and forward/backward compatibility policies
  • ARROW-6172 - [Java] Provide benchmarks to set IntVector with different methods
  • ARROW-6177 - [C++] Add Array::Validate()
  • ARROW-6180 - [C++][Parquet] Add RandomAccessFile::GetStream that returns InputStream that reads a file segment independent of the file's state, fix concurrent buffered Parquet column reads
  • ARROW-6181 - [R] Only allow R package to install without libarrow on linux
  • ARROW-6183 - [R] Document that you don't have to use tidyselect if you don't want
  • ARROW-6185 - [Java] Provide hash table based dictionary builder
  • ARROW-6187 - [C++] Fallback to storage type when writing ExtensionType to Parquet
  • ARROW-6188 - [GLib] Add garrow_array_is_in()
  • ARROW-6192 - [GLib] Use the same SO version as C++
  • ARROW-6194 - [Java] Add non-static approach in DictionaryEncoder making it easy to extend and reuse
  • ARROW-6196 - [Ruby] Add support for building Arrow::TimeNNArray by .new
  • ARROW-6197 - [GLib] Add garrow_decimal128_rescale()
  • ARROW-6199 - [Java] Avro adapter avoid potential resource leak.
  • ARROW-6203 - [GLib] Add garrow_array_sort_to_indices()
  • ARROW-6204 - [GLib] Add garrow_array_is_in_chunked_array()
  • ARROW-6206 - [Java][Docs] Document environment variables/java properties
  • ARROW-6209 - [Java] Extract set null method to the base class for fixed width vectors
  • ARROW-6212 - [Java] Support vector rank operation
  • ARROW-6216 - [C++][Parquet] Expose codec compression level to user, add to Parquet writer properties
  • ARROW-6217 - [Website] Remove needless _site/ directory
  • ARROW-6219 - [Java] Add API for JDBC adapter that can convert less than the full result set at a time
  • ARROW-6220 - [Java] Add API to avro adapter to limit number of rows returned at a time.
  • ARROW-6225 - [Website] Update arrow-site/README and any other places to point website contributors in right direction
  • ARROW-6229 - [C++][Dataset] implement FileSystemBasedDataSource
  • ARROW-6230 - [R] Reading in Parquet files is 20x slower than reading fst files in R
  • ARROW-6231 - [C++] Allow generating CSV column names
  • ARROW-6232 - [C++] Rename Argsort kernel to SortToIndices
  • ARROW-6237 - [R] Allow compilation flags to be passed for R package with ARROW_R_CXXFLAGS
  • ARROW-6238 - [C++][Dataset] Implement SimpleDataSource, SimpleDataFragment and SimpleScanTask
  • ARROW-6240 - [Ruby] Arrow::Decimal128Array#get_value returns BigDecimal
  • ARROW-6242 - [C++][Dataset] Implement Dataset, Scanner and ScannerBuilder
  • ARROW-6243 - [C++][Dataset] Filter expressions
  • ARROW-6244 - [C++][Dataset] Add partition key to DataSource interface
  • ARROW-6246 - [Website] Add link to R documentation site
  • ARROW-6247 - [Java] Provide a common interface for float4 and float8 vectors
  • ARROW-6249 - [Java] Remove useless class ByteArrayWrapper
  • ARROW-6250 - [Java] Implement ApproxEqualsVisitor comparing approx for floating point
  • ARROW-6252 - [C++][Python] Add Array::Diff in C++ and Array.diff in Python to return diff as string
  • ARROW-6253 - [Python] Expose "enable_buffered_stream" option from parquet::ReaderProperties in pyarrow.parquet.read_table
  • ARROW-6258 - [R] Add macOS build scripts
  • ARROW-6260 - [Website] Use deploy key on Travis to build and push to asf-site
  • ARROW-6262 - [Developer] Show JIRA issue before merging
  • ARROW-6264 - [Java] There is no need to consider byte order in ArrowBufHasher
  • ARROW-6265 - [Java] Avro adapter implement Array/Map/Fixed type
  • ARROW-6267 - [Ruby] Add Arrow::Time for Arrow::Time{32,64}DataType value
  • ARROW-6271 - [Rust][DataFusion] Add example for running SQL against Parquet
  • ARROW-6272 - [Rust][DataFusion] Add register_parquet convenience method to ExecutionContext
  • ARROW-6278 - [R] Read parquet files from raw vector
  • ARROW-6279 - [Python] Add Table.slice, getitem support to match RecordBatch, Array, others
  • ARROW-6284 - [C++] Allow references in std::tuple when converting tuple to arrow array
  • ARROW-6287 - [Rust][DataFusion] TableProvider.scan() returns thread-safe BatchIterator
  • ARROW-6288 - [Java] Implement TypeEqualsVisitor comparing vector type equals considering names and metadata
  • ARROW-6289 - [Java] Add empty() in UnionVector to create instance
  • ARROW-6292 - [C++] Add option to use the mimalloc allocator
  • ARROW-6294 - [C++] Use hyphen for plasma-store-server executable
  • ARROW-6295 - [Rust][DataFusion] ExecutionError Cannot compare Float32 with Float64
  • ARROW-6296 - [Java] Cleanup JDBC interfaces and eliminate one memcopy for binary/varchar fields
  • ARROW-6297 - [Java] Compare ArrowBufPointers by unsigned integers
  • ARROW-6300 - [C++] Add Abort() method to streams
  • ARROW-6303 - [Rust] Add a feature to disable SIMD
  • ARROW-6304 - [Java][Doc] Add a description to each module
  • ARROW-6306 - [Java] Support stable sort by stable comparators
  • ARROW-6310 - [C++] Write 64-bit integers as strings in JSON integration test files
  • ARROW-6311 - [Java] Make ApproxEqualsVisitor accept DiffFunction to make it more flexible
  • ARROW-6313 - [Format] Tracking for ensuring flatbuffer serialized values are aligned in stream/files.
  • ARROW-6314 - [C#] Implement IPC message format alignment changes, provide backwards compatibility and "legacy" option to emit old message format
  • ARROW-6314 - [C++] Implement IPC message format alignment changes, provide backwards compatibility and "legacy" option to emit old message format
  • ARROW-6315 - [Java] Make change to ensure flatbuffer reads are aligned
  • ARROW-6316 - [Go] implement new ARROW format with 32b-aligned buffers
  • ARROW-6317 - [JS] Implement IPC message format alignment changes
  • ARROW-6318 - [Integration] Run tests against pregenerated files
  • ARROW-6319 - [C++] Move the core of NumericTensor<T>::Value() to Tensor::Value<T>()
  • ARROW-6326 - [C++] Nullable fields when converting std::tuple to Table
  • ARROW-6328 - [Developer][crossbow] Click.option-s should have help text
  • ARROW-6329 - [Format] Add a padding for Flatbuffer alignment, use 8-byte EOS
  • ARROW-6331 - [Java] Incorporate ErrorProne into the java build
  • ARROW-6334 - [Java] Improve the dictionary builder API to return the position of the value in the dictionary
  • ARROW-6335 - [Java] Improve the performance of DictionaryHashTable
  • ARROW-6336 - [Python] Add notes to pyarrow.serialize/deserialize to clarify that these functions do not read or write the standard IPC protocol
  • ARROW-6337 - [R] Changed as_tibble to as_dataframe in the R package
  • ARROW-6338 - [R] Type function names don't match type names
  • ARROW-6342 - [Python] Add pyarrow.record_batch factory function with same basic API / semantics as pyarrow.table
  • ARROW-6346 - [GLib] Add garrow_array_view()
  • ARROW-6347 - [GLib] Add garrow_array_diff()
  • ARROW-6350 - [Ruby] Remove Arrow::Struct and use Hash instead
  • ARROW-6351 - [Ruby] Improve Arrow#values performance
  • ARROW-6353 - [Python][C++] Expose compression_level option to parquet.write_table
  • ARROW-6355 - [Java] Make range equal visitor reusable
  • ARROW-6356 - [Java] Avro adapter implement Enum type and nested Record
  • ARROW-6357 - [C++] Issue S3 file writes in the background by default
  • ARROW-6358 - [C++] Add FileSystem::DeleteDirContents
  • ARROW-6360 - [R] Update support for compression
  • ARROW-6362 - [C++] Allow customizing S3 credentials provider
  • ARROW-6365 - [R] Should be able to coerce numeric to integer with schema
  • ARROW-6366 - [Java] Make field vectors final explicitly
  • ARROW-6368 - [C++][Dataset] Add interface for "projecting" RecordBatch from one schema to another, inserting null values where needed
  • ARROW-6373 - [C++] Make FixedWidthBinaryBuilder consistent with other fixed width builders in zeroing memory when appending null batches
  • ARROW-6375 - [C++] Extend ConversionTraits to allow efficiently appending list values in STL API
  • ARROW-6379 - [C++] Write no IPC buffer metadata for NullType
  • ARROW-6381 - [C++] BufferOutputStream::Write does extra work that slows down small writes
  • ARROW-6383 - [Java] Report outstanding child allocators on close
  • ARROW-6384 - [C++] Bump dependency versions
  • ARROW-6385 - [C++] Use xxh3 instead of custom hashing code for non-tiny strings
  • ARROW-6391 - [Python][Flight] Add built-in methods on FlightServerBase to start server and wait for it to be available
  • ARROW-6397 - [C++][CI] Generate minio server connect string
  • ARROW-6401 - [Java] Implement dictionary-encoded subfields for Struct type
  • ARROW-6402 - [C++] Suppress sign-compare warning with g++ 9.2.1
  • ARROW-6403 - [Python] Expose FileReader::ReadRowGroups() to Python
  • ARROW-6408 - [Rust] use "if cfg!" pattern
  • ARROW-6413 - [R] Support autogenerating column names
  • ARROW-6415 - [R] Remove usage of R CMD config CXXCPP
  • ARROW-6416 - [Python] Improve API & documentation regarding chunksizes
  • ARROW-6417 - [C++][Parquet] Miscellaneous optimizations yielding slightly better Parquet binary read performance
  • ARROW-6419 - [Website] Blog post about Parquet dictionary performance work coming in 0.15.x release
  • ARROW-6422 - [Gandiva] Fix double-conversion linker issue
  • ARROW-6426 - [FlightRPC][C++][Java] Expose gRPC configuration knobs
  • ARROW-6427 - [GLib] Add support for column names autogeneration CSV read option
  • ARROW-6438 - [R] Add bindings for filesystem API
  • ARROW-6447 - [C++] Allow rest of arrow_objlib to build in parallel while memory_pool.cc is waiting on jemalloc_ep
  • ARROW-6450 - [C++] Use 2x reallocation strategy in BufferBuilder instead of 1.5x
  • ARROW-6451 - [Format] Add clarifications to Columnar.rst about the contents of "null" slots in Varbinary or List arrays
  • ARROW-6453 - [C++] More informative error messages with S3
  • ARROW-6454 - [LICENSE] Add LLVM's license due to static linkage
  • ARROW-6458 - [Java] Remove value boxing/unboxing for ApproxEqualsVisitor
  • ARROW-6460 - [Java] Add benchmark and large fake data UT for avro adapter
  • ARROW-6462 - [C++] Fix build error on CentOS 6 x86_64 with bundled double-conversion
  • ARROW-6465 - [Python] Improvement to Windows build instructions
  • ARROW-6474 - [Python] Add option to use legacy / pre-0.15 IPC message format and to set the default using PYARROW_LEGACY_IPC_FORMAT environment variable
  • ARROW-6475 - [C++] Don't try to dictionary encode dictionary arrays
  • ARROW-6477 - [Packaging][Crossbow] Use Azure Pipelines to build linux packages
  • ARROW-6480 - [Crossbow] Summary report e-mailer with polling logic
  • ARROW-6484 - [Java] Enable create indexType for DictionaryEncoding according to dictionary value count
  • ARROW-6487 - [Rust][DataFusion] Introduce common test module
  • ARROW-6489 - [Developer][Documentation] Fix merge script and readme
  • ARROW-6490 - [Java][Memory] Log error for leak in allocator close
  • ARROW-6491 - [Java][Hotfix] fix master fail caused by ErrorProne
  • ARROW-6494 - [C++][Dataset] Implement basic PartitionScheme
  • ARROW-6504 - [Python][Packaging] Add mimalloc to conda packages for better performance
  • ARROW-6505 - [Website] Add new committers
  • ARROW-6518 - [Packaging][Python] Flight failing in OSX Python wheel builds
  • ARROW-6519 - [Java] Use IPC continuation prefix as part of 8-byte EOS
  • ARROW-6524 - [Developer][Packaging] Nightly build report's subject should contain Arrow
  • ARROW-6525 - [C++] Avoid aborting in CloseFromDestructor()
  • ARROW-6526 - [C++] Poison data in debug mode
  • ARROW-6527 - [C++] Add OutputStream::Write(Buffer)
  • ARROW-6531 - [Python] Add detach() method to buffered streams
  • ARROW-6532 - [R] write_parquet() uses writer properties (general and arrow specific)
  • ARROW-6533 - [R] Compression codec should take a "level"
  • ARROW-6534 - [Java] Fix typos and spelling
  • ARROW-6539 - [R] Provide mechanism to write out old format
  • ARROW-6540 - [R] Add Validate() methods
  • ARROW-6541 - [Format][C++] Update Columnar.rst for two-part EOS, update C++ implementation
  • ARROW-6542 - [R] Add View() method to array types
  • ARROW-6544 - [R] Documentation/polishing for 0.15 release
  • ARROW-6545 - [Go] update IPC writer to use two-part EOS
  • ARROW-6546 - [C++] Add missing FlatBuffers source dependency
  • ARROW-6549 - [C++] Switch to jemalloc 5.2.x
  • ARROW-6556 - [Python] Fix warning for pandas SparseDataFrame removal
  • ARROW-6556 - [Python] Handle future removal of pandas SparseDataFrame
  • ARROW-6557 - [Python] Always return pandas.Series from Array/ChunkedArray.to_pandas. Add mechanism to preserve "column names" from RecordBatch, Table as Series.name
  • ARROW-6558 - [C++] Refactor Iterator to type erased handle
  • ARROW-6559 - [Developer][C++] Add option to pass ARROW_PACKAGE_PREFIX when using 'archery benchmark'
  • ARROW-6563 - [Rust][DataFusion] MergeExec
  • ARROW-6569 - [Website] Add support for auto deployment by GitHub Actions
  • ARROW-6570 - [Python] Use Arrow's allocators for creating NumPy array instead of leaving it to NumPy
  • ARROW-6580 - [Java] Support comparison for unsigned integers
  • ARROW-6584 - [Python][Wheel] Bundle zlib again with the windows wheels
  • ARROW-6588 - [C++] Suppress class-memaccess warning with g++ 9.2.1
  • ARROW-6589 - [C++] Error propagation, tests for /MakeArray(OfNulls|FromScalar)/
  • ARROW-6590 - [C++] Do not require ARROW_JSON to build ARROW_IPC when unit tests are off
  • ARROW-6591 - [R] Ignore .Rhistory files in source control
  • ARROW-6599 - [Rust][DataFusion] Add aggregate traits and SUM implementation to physical query plan
  • ARROW-6601 - [Java] Improve JDBC adapter performance & add benchmark
  • ARROW-6605 - [C++][Filesystem] Add recursion depth control to fs::Selector
  • ARROW-6606 - [C++] Add PathTree tree structure
  • ARROW-6609 - [C++] Add Dockerfile for minimal C++ build
  • ARROW-6613 - [C++] Remove dependency on boost::filesystem
  • ARROW-6614 - [C++][Dataset] Implement FileSystemDataSourceDiscovery
  • ARROW-6616 - [Website] Release announcement blog post for 0.15
  • ARROW-6621 - [Rust][DataFusion] Run DataFusion examples in CI
  • ARROW-6629 - [Doc][C++] Add filesystem docs
  • ARROW-6630 - [Doc] Document C++ file formats
  • ARROW-6644 - [JS] Amend NullType IPC protocol to append no buffers
  • ARROW-6647 - [C++] Stop using member initializer for shared_ptr
  • ARROW-6648 - [Go] Expose the bitutil package
  • ARROW-6649 - [R] print methods for Array, ChunkedArray, Table, RecordBatch
  • ARROW-6653 - [Developer] Add support for auto JIRA link on pull request
  • ARROW-6655 - [Python] Filesystem bindings for S3
  • ARROW-6664 - [C++] Add CMake option to build without SSE4.2 instructions
  • ARROW-6665 - [Rust][DataFusion] Implement physical expression for numeric literal types
  • ARROW-6667 - [Python] remove cyclical object references in pyarrow.parquet
  • ARROW-6668 - [Rust][DataFusion] Implement CAST expression
  • ARROW-6669 - [Rust][DataFusion] Implement binary expression for physical plan
  • ARROW-6675 - [JS] Add scanReverse function to dataFrame and filteredDataframe
  • ARROW-6683 - [Python] Test for fastparquet <-> pyarrow cross-compatibility
  • ARROW-6725 - [CI] Disable 3rdparty fuzzit nightly builds
  • ARROW-6735 - [C++] Suppress sign-compare warning with g++ 9.2.1
  • ARROW-6752 - [Go] implement Stringer for Null array
  • ARROW-6755 - [Release] Improvements to Windows release verification script
  • ARROW-6771 - [Packaging][Python] Missing pytest dependency from conda and wheel builds
  • PARQUET-1468 - [C++] Clean up ColumnReader/internal::RecordReader code duplication

Bug Fixes

  • ARROW-1184 - [Java] Dictionary.equals is not working correctly
  • ARROW-2041 - [Python] pyarrow.serialize has high overhead for list of NumPy arrays
  • ARROW-2248 - [Python] Nightly or on-demand HDFS test builds
  • ARROW-2317 - [Python] Fix C linkage warning with Cython
  • ARROW-2490 - [C++] Normalize input stream concurrency
  • ARROW-3176 - [Python] Overflow in Date32 column conversion to pandas
  • ARROW-3203 - [C++] Build error on Debian Buster
  • ARROW-3651 - [Python] Handle 'datetime' logical type when reconstructing pandas columns from custom metadata
  • ARROW-3652 - [Python][Parquet] Add unit test exhibiting that pandas.CategoricalIndex survives roundtrip to Parquet format
  • ARROW-3762 - [Python] Add large_memory unit test exercising BYTE_ARRAY overflow edge cases from ARROW-3762
  • ARROW-3933 - [C++][Parquet] Handle non-nullable struct children when reading Parquet file, better error messages
  • ARROW-4187 - [C++] Enable file-benchmark on Windows
  • ARROW-4746 - [C++/Python] PyDateTime_Date wrongly cast to PyDateTime_DateTime
  • ARROW-4836 - [C++] Support Tell() on compressed streams
  • ARROW-4848 - [C++] Static libparquet not compiled with -DARROW_STATIC on Windows
  • ARROW-4880 - [Python] Rehabilitate ASV benchmark build scripts
  • ARROW-4883 - [Python] read_csv() returns garbage if given file object in text mode
  • ARROW-5028 - [Python] Avoid malformed ListArray types caused by reaching StringBuilder capacity when converting from Python sequence
  • ARROW-5072 - [Python] write_table fails silently on S3 errors
  • ARROW-5085 - [C++][Parquet][Python] Do not allow reading to dictionary type unless we have implemented support for it
  • ARROW-5086 - [Python][Parquet] Opt in to file memory-mapping when reading Parquet files rather than opting out
  • ARROW-5089 - [C++/Python] Writing dictionary encoded columns to parquet is extremely slow when using chunk size
  • ARROW-5103 - [Python] Segfault when using chunked_array.to_pandas on arrays of different types (edge case)
  • ARROW-5125 - [Python] Round-trip extreme dates on windows
  • ARROW-5161 - [Python] Cannot convert struct type from Pandas object column
  • ARROW-5220 - [Python] Follow-up to improve error messages and docs for from_pandas schema argument
  • ARROW-5220 - [Python] Specified schema in from_pandas also includes the index
  • ARROW-5292 - [C++] Work around symbol visibility issues so building static libraries is not necessary when building unit tests on WIN32 platform
  • ARROW-5300 - [C++] Remove the ARROW_NO_DEFAULT_MEMORY_POOL macro
  • ARROW-5374 - [Python][C++] Improve ipc.read_record_batch docstring, fix IPC message type error messages generated in C++
  • ARROW-5414 - [C++] default to release build on windows
  • ARROW-5450 - [Python] Always return datetime.datetime in TimestampValue.as_py for units other than nanoseconds
  • ARROW-5471 - [C++][Gandiva] Array offset is ignored in Gandiva projector
  • ARROW-5522 - [Packaging][Documentation] Comments out of date in python/manylinux1/build_arrow.sh
  • ARROW-5525 - [C++] Add Continuous Fuzzing Integration setup with Fuzzit
  • ARROW-5560 - [C++][Plasma] Cannot create Plasma object after OutOfMemory error
  • ARROW-5562 - [C++][Parquet] Write negative zero or small epsilons as positive zero when computing Parquet statistics
  • ARROW-5630 - [C++][Parquet] Fix RecordReader accounting for repeated fields with non-nullable leaf
  • ARROW-5638 - [C++][CMake] Fixes for xcode project builds
  • ARROW-5651 - [Python] Fix Incorrect conversion from strided Numpy array
  • ARROW-5682 - [Python] Raise error when trying to convert non-string dtype to string
  • ARROW-5731 - [CI] Switch turbodbc branch for integration testing
  • ARROW-5753 - [Rust] Fix test failure in CI code coverage
  • ARROW-5772 - [GLib][Plasma][CUDA] Fix a bug that data can't be got
  • ARROW-5775 - [C++] Fix thread-unsafe cached data
  • ARROW-5776 - [Gandiva][Crossbow] Use commit id instead of fetch head.
  • ARROW-5790 - [Python] Raise error when trying to convert 0-dim array in pa.array
  • ARROW-5817 - [Python] Use pytest mark for flight tests
  • ARROW-5823 - [Rust] CI scripts miss --all-targets cargo argument
  • ARROW-5824 - [Gandiva][C++] Fix decimal null literals.
  • ARROW-5836 - [Java][FlightRPC] Skip Flight domain socket test when path too long
  • ARROW-5838 - [C++] Delegate OPENSSL_ROOT_DIR to bundled gRPC
  • ARROW-5848 - [C++] SO versioning schema after release 1.0.0
  • ARROW-5849 - [C++] Fix compiler warnings on mingw32
  • ARROW-5850 - [CI][R] R appveyor job is broken after release
  • ARROW-5851 - [C++] Fix compilation of reference benchmarks
  • ARROW-5856 - [Python][Packaging] Fix use of C++ / Cython API from wheels
  • ARROW-5860 - [Java][Vector] Fix decimal utils to handle negative values.
  • ARROW-5863 - [Python] Use atexit module for extension type finalization to avoid segfault
  • ARROW-5868 - [Python] Correctly remove liblz4 shared libraries from manylinux2010 image so lz4 is statically linked
  • ARROW-5870 - [C++][Docs] Refine source build instructions, do not tell people to install flex/bison if they don't need them
  • ARROW-5873 - [Python] Guard for passed None in Schema.equals
  • ARROW-5874 - [Python] Fix macOS wheels to depend on system or Homebrew OpenSSL
  • ARROW-5878 - [C++][Parquet] Restore pre-0.14.0 Parquet forward compatibility by adding option to unconditionally set TIMESTAMP_MICROS/TIMESTAMP_MILLIS ConvertedType
  • ARROW-5884 - [Java] Fix the get method of StructVector
  • ARROW-5886 - [Python][Packaging] Manylinux1/2010 compliance issue with libz
  • ARROW-5887 - [C#] ArrowStreamWriter writes FieldNodes in wrong order
  • ARROW-5889 - [C++][Parquet] Add property to indicate origin from converted type to TimestampLogicalType
  • ARROW-5894 - [Gandiva][C++] Added a linker script for libgandiva.so to restrict libstdc++ symbols.
  • ARROW-5899 - [Python][Packaging] Build and link uriparser statically in Windows wheel builds
  • ARROW-5910 - [Python] Support non-seekable streams in ipc.read_tensor, ipc.read_message, add Message.serialize_to method
  • ARROW-5921 - [C++] Fix multiple nullptr related crashes in IPC
  • ARROW-5923 - [C++][Parquet] Reword comment about UBSan and Int96 in writer.cc
  • ARROW-5925 - [Gandiva][C++] fix rounding in decimal to int cast
  • ARROW-5930 - [Python] Make Flight server init phase explicit
  • ARROW-5930 - [FlightRPC][Python] Disable Flight test causing segfault in Travis
  • ARROW-5935 - [C++] ArrayBuilder::type() should be kept accurate
  • ARROW-5946 - [Rust][DataFusion] Fix bug in projection push down logic
  • ARROW-5952 - [Python] fix conversion of chunked dictionary array with 0 chunks
  • ARROW-5959 - [CI] report branch+commit to fuzzit
  • ARROW-5960 - [C++] Fix Boost dependencies link order
  • ARROW-5963 - [R] R Appveyor job does not test changes in the C++ library
  • ARROW-5964 - [C++][Gandiva] Remove overflow check after rounding in BasicDecimal128::FromDouble
  • ARROW-5965 - [Python] Regression: segfault when reading hive table with v0.14
  • ARROW-5966 - [Python] Also use ChunkedStringBuilder when converting NumPy string types to Arrow StringType
  • ARROW-5968 - [Java] Remove duplicate Preconditions check in JDBC adapter
  • ARROW-5969 - [R] Fix R lint Failures
  • ARROW-5973 - [Java] Variable width vectors' get methods should return null when the underlying data is null
  • ARROW-5978 - [FlightRPC][Java] Properly release buffers in Flight integration client
  • ARROW-5989 - [C++] Accommodate openjdk-8 path search prefix
  • ARROW-5990 - [Python] add bounds check to RowGroupMetaData.column
  • ARROW-5992 - [C++][Python] Support String->Binary in Array::View. Add Python bindings for Array::View
  • ARROW-5993 - [Python] Reading a dictionary column from Parquet results in disproportionate memory usage
  • ARROW-5996 - [Java] Avoid potential resource leak in flight service
  • ARROW-5999 - [C++] decouple Iterator from ARROW_DATASETS
  • ARROW-6002 - [C++][Gandiva] test casting int64 to decimal
  • ARROW-6004 - [C++] Turn non-ignored empty CSV lines into null/empty values
  • ARROW-6005 - [C++] extend GetRecordBatchReader test to cover reading a single row group
  • ARROW-6006 - [C++] Do not fail to read empty IPC stream with schema having dictionary types
  • ARROW-6012 - [C++] Fall back on known Apache mirror for Thrift downloads
  • ARROW-6015 - [Python] Add note to python/README.md about installing Visual C++ Redistributable on Windows when using pip
  • ARROW-6016 - [Python] Fix get_library_dirs() when Arrow installed as a system package
  • ARROW-6029 - [R] Improve R docs on how to fix library version mismatch
  • ARROW-6032 - [C++] Ensure 64-bit pointer alignment in CountSetBits()
  • ARROW-6038 - [C++] Faster type equality
  • ARROW-6040 - [Java] Dictionary entries are required in IPC streams even when empty
  • ARROW-6046 - [C++] Do not write excess varbinary offsets in IPC messages from sliced BinaryArray
  • ARROW-6047 - [Rust] Rust nightly 1.38.0 builds failing
  • ARROW-6050 - [Java] Update out-of-date java/flight/README.md
  • ARROW-6054 - [Python] Fix the type erasure bug when serializing structured type ndarray.
  • ARROW-6058 - [C++][Parquet] Validate whole ColumnChunk raw data reads so that underlying filesystem issues are caught earlier
  • ARROW-6059 - [Python] Regression memory issue when calling pandas.read_parquet
  • ARROW-6060 - [C++] ChunkedBinaryBuilder should only grow when necessary, address runaway memory use in Parquet binary column read
  • ARROW-6061 - [C++] Add ARROW_JSON feature flag for configuring arrow builds without RapidJSON
  • ARROW-6066 - [Website] Fix blog post author header
  • ARROW-6067 - [Python] Fix failing large memory Python tests
  • ARROW-6068 - [C++] Allow passing Field instances to StructArray::Make
  • ARROW-6073 - [C++] Reset Decimal128Builder in Finish().
  • ARROW-6082 - [Python] check type of the index_type passed to pa.dictionary()
  • ARROW-6092 - [Python] Fix C++ arrow-python-test on Python 2.7
  • ARROW-6095 - [C++] Fix unit test build when only building static libraries, add cpp-static-only to tests.yml
  • ARROW-6108 - [C++] Workaround Windows CRT crash on invalid locale
  • ARROW-6116 - [C++][Gandiva] Fix bug in TimedTestFilterAdd2
  • ARROW-6117 - [Java] Fix the set method of FixedSizeBinaryVector
  • ARROW-6119 - [Python] PyArrow wheel import fails on Windows Python 3.7
  • ARROW-6120 - [C++] Forbid use of <iostream> in public header files
  • ARROW-6126 - [C++] Return error when an IPC stream terminates in the middle of receiving dictionaries
  • ARROW-6132 - [Python] validate result in ListArray.from_arrays
  • ARROW-6135 - [C++] Make KeyValueMetadata::Equals() order-insensitive
  • ARROW-6136 - [FlightRPC][Java] don't double-close response stream
  • ARROW-6145 - [Java] UnionVector created by MinorType#getNewVector could not keep field type info properly
  • ARROW-6148 - [Packaging] Improve aarch64 support
  • ARROW-6152 - [C++][Parquet] Add parquet::ColumnWriter::WriteArrow method, refactor
  • ARROW-6153 - [R] Address parquet deprecation warning
  • ARROW-6158 - [C++/Python] Validate child array types with type fields of StructArray
  • ARROW-6159 - [C++] Properly indent first line of PrettyPrint with Schema
  • ARROW-6160 - [Java] AbstractStructVector#getPrimitiveVectors fails to work with complex child vectors
  • ARROW-6166 - [Go] Fix index out of bounds panic when slicing a slice
  • ARROW-6167 - [R] macOS binary R packages on CRAN don't have arrow_available
  • ARROW-6168 - [C++] IWYU docker-compose job is broken
  • ARROW-6170 - [R] Faster docker-compose build
  • ARROW-6171 - [R][CI] Fix R library search path
  • ARROW-6174 - [C++] Validate chunks in ChunkedArray::Validate. Fix validation of sliced ListArray, values null checks
  • ARROW-6175 - [Java] Fix MapVector#getMinorType and extend AbstractContainerVector addOrGet complex vector API
  • ARROW-6178 - [Developer] Keep prompting for authors in merge script for multi-author PRs if given bad input
  • ARROW-6182 - [R] Add note to README about r-arrow conda installation
  • ARROW-6186 - [Packaging][deb] Add missing headers to libplasma-dev for Ubuntu 16.04
  • ARROW-6190 - [C++] Define and declare functions regardless of NDEBUG
  • ARROW-6193 - [GLib] Add missing require in test
  • ARROW-6200 - [Java] Method getBufferSizeFor in BaseRepeatedValueVector/ListVector not correct
  • ARROW-6202 - [Java] Add unit test for large resultsets
  • ARROW-6205 - [C++] ARROW_DEPRECATED warning when including io/interfaces.h
  • ARROW-6208 - [Java] Correct byte order before comparing in ByteFunctionHelpers
  • ARROW-6210 - [Java] remove equals API from ValueVector
  • ARROW-6211 - [Java] Remove dependency on RangeEqualsVisitor from ValueVector interface
  • ARROW-6214 - [R] Add R sanitizer docker image
  • ARROW-6215 - [Java] Fix case when ZeroVector is compared against other vector types
  • ARROW-6218 - [Java] Add UINT type test in integration to avoid potential overflow
  • ARROW-6223 - [C++] Configuration error with Anaconda Python 3.7.4
  • ARROW-6224 - [Python] fix deprecated usage of .data (previously Column.data)
  • ARROW-6227 - [Python] Apply from_pandas option in pyarrow.array consistently across types
  • ARROW-6234 - [Java] ListVector hashCode() is not correct
  • ARROW-6241 - [Java] Failures on master
  • ARROW-6255 - [Rust] [Parquet] Cannot use any published parquet crate due to parquet-format breaking change
  • ARROW-6259 - [C++] Add -Wno-extra-semi-stmt when compiling with clang 8 to work around Flatbuffers bug, suppress other new LLVM 8 warnings
  • ARROW-6263 - [Python] Use RecordBatch::Validate in RecordBatch.from_arrays. Normalize API vs. Table.from_arrays. Add record_batch factory function
  • ARROW-6266 - [Java] Resolve the ambiguous method overload in RangeEqualsVisitor
  • ARROW-6268 - [Java] Empty buffers to have a valid address.
  • ARROW-6269 - [C++] check decimal precision in IPC code
  • ARROW-6270 - [C++] check buffer_index bounds in IpcComponentSource.GetBuffer
  • ARROW-6290 - [Rust][DataFusion] Fix bug in type coercion rule
  • ARROW-6291 - [C++] Do not override ARROW_PARQUET if other PARQUET options are enabled
  • ARROW-6293 - [Rust] datafusion 0.15.0-SNAPSHOT error
  • ARROW-6301 - [C++][Python] Prevent ExtensionType-related race condition in Python process teardown by exposing shared_ptr to global "ExtensionTypeRegistry"
  • ARROW-6302 - [C++][Parquet][Python] Restore ordered type property when reading dictionary type with serialized Arrow schema
  • ARROW-6309 - [C++][Parquet] Stop needless static linking
  • ARROW-6323 - [R] Expand file paths when passing to readers
  • ARROW-6325 - [Python] fix conversion of strided boolean arrays
  • ARROW-6330 - [C++] Include missing API headers
  • ARROW-6332 - [Java][C++][Gandiva] Misc fixes for varwidth vector allocation.
  • ARROW-6339 - [Python] Raise ValueError when accessing unset statistics
  • ARROW-6343 - [Java][Vector] Fix allocation helper.
  • ARROW-6344 - [C++][Gandiva] Handle multibyte characters in substring function
  • ARROW-6345 - [C++][Python] "ordered" flag seemingly not taken into account when comparing DictionaryType values for equality
  • ARROW-6348 - [R] arrow::read_csv_arrow namespace error when package not loaded
  • ARROW-6354 - [C++] Fix failing build when ARROW_PARQUET=OFF
  • ARROW-6363 - [R] segfault in Table__from_dots with unexpected schema
  • ARROW-6364 - [R] Handling unexpected input to time64() et al.
  • ARROW-6369 - [C++] Handle Array.to_pandas case for type=list<bool>
  • ARROW-6371 - [Doc] Row to columnar conversion example mentions arrow::Column in comments
  • ARROW-6372 - [Rust][DataFusion] Casting from Unsigned to Signed Integers not supported
  • ARROW-6376 - [Developer] Use target ref of PR when merging instead of hard-coding "master"
  • ARROW-6387 - [Archery] Errors with make
  • ARROW-6392 - [FlightRPC][Python] check type of list_flights result
  • ARROW-6395 - [Python] Bug when using bool arrays with stride greater than 1
  • ARROW-6406 - [C++] Fix jemalloc URL for offline build in thirdparty/versions.txt
  • ARROW-6411 - [Python][Parquet] Improve performance of DictEncoder::PutIndices
  • ARROW-6412 - [C++] Improve TCP port allocation in tests
  • ARROW-6418 - [C++][Plasma] Remove cmake project directive for plasma
  • ARROW-6423 - [C++] Fix crash when trying to instantiate Snappy CompressedOutputStream
  • ARROW-6424 - [C++] Fix IPC fuzzing test name
  • ARROW-6425 - [C++] ValidateArray fail for slice of list array
  • ARROW-6428 - [CI][Crossbow] Nightly turbodbc job fails
  • ARROW-6430 - [CI][Crossbow] Nightly R docker job fails
  • ARROW-6431 - [Python] Test suite fails without pandas installed
  • ARROW-6432 - [CI][Crossbow] Remove alpine nightly crossbow jobs
  • ARROW-6433 - [Java][CI] Fix java docker image
  • ARROW-6434 - [CI][Crossbow] Nightly HDFS integration job fails
  • ARROW-6435 - [Python] Use pandas null coding consistently on List and Struct types
  • ARROW-6440 - [Packaging][deb] Follow plasma-store-server name change
  • ARROW-6441 - [Packaging][RPM] Follow plasma-store-server name change
  • ARROW-6442 - [CI][Crossbow] Nightly gandiva jar osx build fails
  • ARROW-6443 - [CI][Crossbow] Nightly conda osx builds fail
  • ARROW-6444 - [CI][Crossbow] Nightly conda Windows builds fail (time out)
  • ARROW-6446 - [OSX][Python][Wheel] Turn off ORC feature in the wheel building scripts
  • ARROW-6449 - [R] io "tell()" methods are inconsistently named and untested
  • ARROW-6457 - [C++] Always set CMAKE_BUILD_TYPE if it is not defined
  • ARROW-6461 - [Java] Prevent EchoServer from closing the client socket after writing
  • ARROW-6472 - [Java] ValueVector#accept may have a potential cast exception
  • ARROW-6476 - [Java][CI] Fix java docker build script
  • ARROW-6478 - [C++] Revert to jemalloc stable-4 until we understand 5.2.x performance issues
  • ARROW-6481 - [C++] Avoid copying large ConvertOptions
  • ARROW-6488 - [Python] fix equality with pyarrow.NULL to return NULL
  • ARROW-6492 - [Python] Handle pandas_metadata created by fastparquet with missing field_name
  • ARROW-6502 - [GLib][CI] Pin gobject-introspection gem to 3.3.7
  • ARROW-6506 - [C++] Fix validation of ExtensionArray with struct storage type
  • ARROW-6509 - [C++][Gandiva] Re-enable Gandiva JNI tests and fix Travis CI failure
  • ARROW-6509 - [Java][CI] Upgrade maven-surefire-plugin to version 3.0.0-M3, disable Gandiva JNI unit tests temporarily
  • ARROW-6520 - [Python] More consistent handling of specified schema when creating Table
  • ARROW-6522 - [Python] Fix failing pandas tests on older pandas / older python
  • ARROW-6530 - [CI][Crossbow][R] Nightly R job doesn't install all dependencies
  • ARROW-6550 - [C++] Filter expressions PR failing manylinux package builds
  • ARROW-6551 - [Python] Dask Parquet integration test failure
  • ARROW-6552 - [C++] boost::optional in STL test fails compiling in gcc 4.8.2
  • ARROW-6560 - [Python] Fix nopandas integration tests
  • ARROW-6561 - [Python] Fix python tests to pass on pandas master
  • ARROW-6562 - [GLib] Fix returning wrong sliced data of GArrowBuffer
  • ARROW-6564 - [Python] Do not require pandas for invoking Array.array
  • ARROW-6565 - [Rust][DataFusion] Fix intermittent test failure
  • ARROW-6568 - [C++] ChunkedArray constructor needs type when chunks is empty
  • ARROW-6572 - [C++] Fix Parquet decoding returning uninitialized data
  • ARROW-6573 - [Python] Add test case to probe additional behavior in schema-data mismatch in Table.from_pydict
  • ARROW-6576 - [R] Fix sparklyr integration tests
  • ARROW-6586 - [Python][Packaging] Windows wheel builds failing with "DLL load failure"
  • ARROW-6597 - [Python] Sanitize Python datetime handling
  • ARROW-6618 - [Python] Fix read_message() segfault on end of stream
  • ARROW-6620 - [Python][CI] pandas-master build failing due to removal of "to_sparse" method
  • ARROW-6622 - [R] Normalize paths for filesystem API on Windows
  • ARROW-6623 - [CI][Python] Dask docker integration test broken perhaps by statistics-related change
  • ARROW-6639 - [Packaging][RPM] Add support for CentOS 7 on aarch64
  • ARROW-6640 - [C++] Do not reset buffer_pos_ in BufferedInputStream/OutputStream when enlarging buffer
  • ARROW-6641 - [C++] Remove Deprecated WriteableFile warning
  • ARROW-6642 - [Python] Link parent objects in Parquet's metadata and statistics objects
  • ARROW-6651 - Fix conda R job
  • ARROW-6652 - [Python] Fix ChunkedArray.to_pandas to retain timezone
  • ARROW-6652 - [Python] Fix Array.to_pandas to retain timezone
  • ARROW-6660 - [Rust][DataFusion] Minor docs update for 0.15.0 release
  • ARROW-6670 - [CI][R] Fix fixes for R nightly jobs
  • ARROW-6674 - [Python] Fix or ignore the test warnings
  • ARROW-6677 - [FlightRPC][C++] Document Flight in C++
  • ARROW-6678 - [C++][Parquet] Binary data stored in Parquet metadata must be base64-encoded to be UTF-8 compliant
  • ARROW-6679 - [RELEASE] Add license info for the autobrew scripts
  • ARROW-6682 - [C#] Ensure file footer block lengths are always 8 byte aligned.
  • ARROW-6687 - [Rust][DataFusion] Add regression tests for np.nan parquet file
  • ARROW-6687 - [Rust][DataFusion] Bug fix in DataFusion Parquet reader
  • ARROW-6701 - [C++][R] Lint failing on R cpp code
  • ARROW-6703 - [Packaging][Linux] Restore ARROW_VERSION environment variable
  • ARROW-6705 - [Rust][DataFusion] README has invalid github URL
  • ARROW-6709 - [JAVA] Jdbc adapter currentIndex should increment when va…
  • ARROW-6714 - [R] Fix untested RecordBatchWriter case
  • ARROW-6716 - [Rust] Bump nightly to nightly-2019-09-25 to fix CI
  • ARROW-6748 - [RUBY] gem compilation error
  • ARROW-6751 - [CI] ccache doesn't cache on Travis-CI
  • ARROW-6760 - [C++] JSON: improve error message when column changed type
  • ARROW-6773 - [C++] Filter kernel returns invalid data when filtering with an Array slice
  • ARROW-6796 - Certain moderately-sized (~100MB) default-Snappy-compressed Parquet files take enormous memory and a long time to load with pyarrow.parquet.read_table
  • ARROW-7112 - Wrong contents when initializing a pyarrow.Table from boolean DataFrame
  • PARQUET-1623 - [C++] Fix invalid memory access encountered when reading some parquet files
  • PARQUET-1631 - [C++] ParquetInputWrapper::GetSize returns Tell
  • PARQUET-1640 - [C++] Fix crash in parquet-encoding-benchmark
ptaylor
published 0.14.1

Apache Arrow 0.14.1 (2019-07-22)

Bug Fixes

  • ARROW-5775 - [C++] Fix thread-unsafe cached data
  • ARROW-5790 - [Python] Raise error when trying to convert 0-dim array in pa.array
  • ARROW-5791 - [C++] Fix infinite loop with more than 32768 columns.
  • ARROW-5816 - [Release] Do not curl in background in verify-release-candidate.sh
  • ARROW-5836 - [Java][FlightRPC] Skip Flight domain socket test when path too long
  • ARROW-5838 - [C++] Delegate OPENSSL_ROOT_DIR to bundled gRPC
  • ARROW-5849 - [C++] Fix compiler warnings on mingw32
  • ARROW-5850 - [CI][R] R appveyor job is broken after release
  • ARROW-5851 - [C++] Fix compilation of reference benchmarks
  • ARROW-5856 - [Python][Packaging] Fix use of C++ / Cython API from wheels
  • ARROW-5863 - [Python] Use atexit module for extension type finalization to avoid segfault
  • ARROW-5868 - [Python] Correctly remove liblz4 shared libraries from manylinux2010 image so lz4 is statically linked
  • ARROW-5873 - [Python] Guard for passed None in Schema.equals
  • ARROW-5874 - [Python] Fix macOS wheels to depend on system or Homebrew OpenSSL
  • ARROW-5878 - [C++][Parquet] Restore pre-0.14.0 Parquet forward compatibility by adding option to unconditionally set TIMESTAMP_MICROS/TIMESTAMP_MILLIS ConvertedType
  • ARROW-5886 - [Python][Packaging] Manylinux1/2010 compliance issue with libz
  • ARROW-5887 - [C#] ArrowStreamWriter writes FieldNodes in wrong order
  • ARROW-5889 - [C++][Parquet] Add property to indicate origin from converted type to TimestampLogicalType
  • ARROW-5899 - [Python][Packaging] Build and link uriparser statically in Windows wheel builds
  • ARROW-5921 - [C++] Fix multiple nullptr related crashes in IPC
  • PARQUET-1623 - [C++] Fix invalid memory access encountered when reading some parquet files

New Features and Improvements

  • ARROW-5101 - [Packaging] Avoid bundling static libraries in Windows conda packages
  • ARROW-5380 - [C++] Fix memory alignment UBSan errors.
  • ARROW-5564 - [C++] Use uriparser from conda-forge
  • ARROW-5609 - [C++] Set CMP0068 CMake policy to avoid macOS warnings
  • ARROW-5784 - [Release][GLib] Replace c_glib/ after running c_glib/autogen.sh in dev/release/02-source.sh
  • ARROW-5785 - [Rust] Make the datafusion cli dependencies optional
  • ARROW-5787 - [Release][Rust] Use local modules to verify RC
  • ARROW-5793 - [Release] Avoid duplicated known host SSH error in dev/release/03-binary.sh
  • ARROW-5794 - [Release] Skip uploading already uploaded binaries
  • ARROW-5795 - [Release] Add missing waits on uploading binaries
  • ARROW-5796 - [Release][APT] Update expected package list
  • ARROW-5797 - [Release][APT] Update supported distributions
  • ARROW-5820 - [Release] Remove undefined variable check from verify script
  • ARROW-5827 - [C++] Require c-ares CMake config
  • ARROW-5828 - [C++] Add required Protocol Buffers versions check
  • ARROW-5866 - [C++] Remove duplicate library in cpp/Brewfile
  • ARROW-5877 - [FlightRPC] Fix Python<->Java auth issues
  • ARROW-5904 - [Java][Plasma] Fix compilation of Plasma Java client
  • ARROW-5908 - [C#] ArrowStreamWriter doesn't align buffers to 8 bytes
  • ARROW-5934 - [Python] Bundle arrow's LICENSE with the wheels
  • ARROW-5937 - [Release] Stop parallel binary upload
  • ARROW-5938 - [Release] Create branch for adding release note automatically
  • ARROW-5939 - [Release] Add support for generating vote email template separately
  • ARROW-5940 - [Release] Add support for re-uploading sign/checksum for binary artifacts
  • ARROW-5941 - [Release] Avoid re-uploading already uploaded binary artifacts
  • ARROW-5958 - [Python] Link zlib statically in the wheels
kou
published 0.14.0

Apache Arrow 0.14.0 (2019-07-04)

New Features and Improvements

  • ARROW-258 - [Format] clarify definition of Buffer in context of RPC, IPC, File
  • ARROW-653 - [Python / C++] Add debugging function to print an array's buffer contents in hexadecimal
  • ARROW-767 - [C++] Filesystem abstraction
  • ARROW-835 - [Format][C++][Java] Create a new Duration type
  • ARROW-840 - [Python] Expose extension types
  • ARROW-973 - [Website] Add FAQ page
  • ARROW-1012 - [C++] Configurable batch size for parquet RecordBatchReader
  • ARROW-1207 - [C++] Implement MapArray, MapBuilder, MapType classes, and IPC support
  • ARROW-1261 - [Java] Add MapVector with reader and writer
  • ARROW-1278 - [Integration] Adding integration tests for fixed_size_list
  • ARROW-1279 - [Integration] Enable MapType integration tests
  • ARROW-1280 - [C++] add fixed size list type
  • ARROW-1349 - [Packaging] Provide APT and Yum repositories
  • ARROW-1496 - [JS] Upload coverage data to codecov.io
  • ARROW-1558 - [C++] Implement boolean filter (selection) kernel, rename comparison kernel-related functions
  • ARROW-1587 - [Format] Add metadata for user-defined logical types
  • ARROW-1774 - [C++] Add Array::View()
  • ARROW-1833 - [Java] Add accessor methods for data buffers that skip null checking
  • ARROW-1957 - [Python] Write nanosecond timestamps using new NANO LogicalType Parquet unit
  • ARROW-1983 - [C++][Parquet] Add AppendRowGroups and WriteMetaDataFile methods
  • ARROW-2057 - [Python] Expose option to configure data page size threshold in parquet.write_table
  • ARROW-2102 - [C++] Implement Take kernel
  • ARROW-2103 - [C++] Implement take kernel functions - string/binary value type
  • ARROW-2104 - [C++] take kernel functions for nested types
  • ARROW-2105 - [C++] Implement take kernel functions - properly handle special indices
  • ARROW-2186 - [C++] Clean up architecture specific compiler flags
  • ARROW-2217 - [C++] Add option to use dynamic linking for compression library dependencies
  • ARROW-2298 - [Python] Add unit tests to assert that float64 with NaN values can be safely coerced to integer types when converting from pandas
  • ARROW-2412 - [Integration] Add nested dictionary test case, skipped for now
  • ARROW-2467 - [Rust] Add generated IPC code
  • ARROW-2517 - [Java] Add list<decimal> writer
  • ARROW-2618 - [Rust] Bitmap constructor should accept a flag for default state (0 or 1)
  • ARROW-2667 - [C++/Python] Add pandas-like take method to Array
  • ARROW-2707 - [C++] Add Table::Slice
  • ARROW-2709 - [Python] write_to_dataset poor performance when splitting
  • ARROW-2730 - [C++] Set up CMAKE_C_FLAGS more thoughtfully instead of using CMAKE_CXX_FLAGS
  • ARROW-2796 - [C++] Simplify version script used for linking
  • ARROW-2818 - [Python] Better error message when trying to convert sparse pandas data to arrow Table
  • ARROW-2835 - [C++] Make file position undefined after ReadAt()
  • ARROW-2969 - [R] Convert between StructArray and "nested" data.frame column containing data frame in each cell
  • ARROW-2981 - [C++] improve clang-tidy usability
  • ARROW-2984 - [JS] Refactor release verification script to share code with main source release verification script
  • ARROW-3040 - [Go] add support for comparing Arrays
  • ARROW-3041 - [Go] add support for TimeArray
  • ARROW-3052 - [C++] Detect Apache ORC C++ libraries in system/conda toolchain, add to conda requirements
  • ARROW-3087 - [C++] Implement Compare filter kernel
  • ARROW-3144 - [C++/Python] Move "dictionary" member from DictionaryType to ArrayData to allow for variable dictionaries
  • ARROW-3150 - [Python] Enable Flight in Python wheels for Linux and Windows
  • ARROW-3166 - [C++] Consolidate IO interfaces used in arrow/io and parquet-cpp
  • ARROW-3191 - [Java] Make ArrowBuf work with arbitrary underlying memory
  • ARROW-3200 - [C++] Support dictionaries in Flight streams
  • ARROW-3290 - [C++] Toolchain support for secure gRPC
  • ARROW-3294 - [C++][Flight] Support Flight on Windows
  • ARROW-3314 - [R] Set -rpath using pkg-config when building
  • ARROW-3330 - [C++] Spawn multiple Flight performance servers in flight-benchmark to test parallel get performance
  • ARROW-3419 - [C++] Run include-what-you-use checks as nightly build
  • ARROW-3459 - [C++][Gandiva] Add support for variable length output vectors
  • ARROW-3475 - [C++] Allow builders to finish to the corresponding array type
  • ARROW-3570 - [Packaging] Don't bundle test data files with python wheels
  • ARROW-3572 - [Crossbow] Raise more helpful exception if Crossbow queue has an SSH origin URL
  • ARROW-3671 - [Go] implement MonthInterval and DayTimeInterval
  • ARROW-3676 - [Go] implement Decimal128 array
  • ARROW-3679 - [Go] implement read/write IPC for Decimal128
  • ARROW-3680 - [Go] implement Float16 array
  • ARROW-3686 - [Python] support masked arrays in pa.array
  • ARROW-3702 - [R] POSIXct mapped to DateType not TimestampType?
  • ARROW-3714 - [CI] Run RAT checks in pre-commit hooks
  • ARROW-3729 - [C++][Parquet] Use logical annotations in Arrow Parquet reader/writer
  • ARROW-3732 - [R] Add functions to write RecordBatch or Schema to Message value, then read back
  • ARROW-3758 - [R] Build R library and dependencies on Windows in Appveyor CI
  • ARROW-3759 - [R][CI] Build and test (no libarrow) on Windows in Appveyor
  • ARROW-3767 - [C++] Add cast from null to any other type
  • ARROW-3780 - [R] Failed to fetch data: invalid data when collecting int16
  • ARROW-3791 - [C++ / Python] Add boolean type inference to the CSV parser
  • ARROW-3794 - [R] Consider mapping INT8 to integer() not raw()
  • ARROW-3804 - [R] Support older versions of R runtime
  • ARROW-3810 - [R] type= argument for Array and ChunkedArray
  • ARROW-3811 - [R] Support inferring data.frame column as StructArray in array constructors
  • ARROW-3814 - [R] RecordBatch$from_arrays()
  • ARROW-3815 - [R] Refine record batch factory
  • ARROW-3848 - [R] allow nbytes to be missing in RandomAccessFile$Read()
  • ARROW-3897 - [MATLAB] Add MATLAB support for writing numeric datatypes to a Feather file
  • ARROW-3904 - [C++/Python] Validate scale and precision of decimal128 type
  • ARROW-4013 - [Docs][C++] Add how to build on MSYS2
  • ARROW-4020 - [Release] Add a post release script to remove RC
  • ARROW-4047 - [Python] Document use of int96 timestamps and options in Parquet docs
  • ARROW-4086 - [Java] Add apis to debug memory alloc failures
  • ARROW-4121 - [C++] Refactor memory allocation from InvertKernel
  • ARROW-4159 - [C++] Build with -Wdocumentation when using clang and BUILD_WARNING_LEVEL=CHECKIN
  • ARROW-4194 - [Format][Docs] Remove duplicated / out-of-date logical type information from documentation
  • ARROW-4302 - [C++] Add OpenSSL to C++ build toolchain (#4384)
  • ARROW-4337 - [C#] Implemented Fluent API for building arrays and record batches
  • ARROW-4343 - [C++] Add docker-compose test for gcc 4.8 / Ubuntu 14.04 (Trusty), expand Xenial/16.04 Dockerfile to test Flight
  • ARROW-4356 - [CI] Add integration (docker) test for turbodbc
  • ARROW-4369 - [Packaging] Release verification script should test linux packages via docker
  • ARROW-4452 - [Python] Serialize sparse torch tensors
  • ARROW-4453 - [Python] Create Cython wrappers for SparseTensor
  • ARROW-4467 - [Rust][DataFusion] Create a REPL & Dockerfile for DataFusion
  • ARROW-4503 - [C#] Eliminate allocations in ArrowStreamReader when reading from a Stream
  • ARROW-4504 - [C++] Reduce number of C++ unit test executables from 128 to 82
  • ARROW-4505 - [C++] adding pretty print for dates, times, and timestamps
  • ARROW-4566 - [Flight] Add option to run Flight benchmark against separate server
  • ARROW-4596 - [Rust][DataFusion] Implement COUNT
  • ARROW-4622 - [C++][Python] MakeDense and MakeSparse in UnionArray should accept a vector of Field
  • ARROW-4625 - [Flight][Java] Add method to await Flight server termination in Java
  • ARROW-4626 - [Flight] Add application-defined metadata to DoGet/DoPut
  • ARROW-4627 - [Flight] Add application metadata field to DoPut
  • ARROW-4701 - [C++] Add JSON chunker benchmarks
  • ARROW-4702 - [C++] Update dependency versions
  • ARROW-4708 - [C++] add multithreaded json reader
  • ARROW-4708 - [C++] refactoring JSON parser to prepare for multithreaded impl
  • ARROW-4714 - [C++][JAVA] Providing JNI interface to Read ORC file via Arrow C++
  • ARROW-4717 - [C#] Consider exposing ValueTask instead of Task
  • ARROW-4719 - [C#] Implement ChunkedArray, Column and Table in C#
  • ARROW-4741 - [Java] Add missing type javadoc and enable checkstyle
  • ARROW-4787 - [C++] Add support for Null in MemoTable and related kernels
  • ARROW-4788 - [C++] Less verbose API for constructing StructArray
  • ARROW-4800 - [C++] Introduce a Result<T> class
  • ARROW-4805 - [Rust] Write temporal arrays to CSV
  • ARROW-4806 - [Rust] Temporal array casts
  • ARROW-4824 - [Python] Fix error checking in read_csv()
  • ARROW-4827 - [C++] Implement benchmark comparison
  • ARROW-4847 - [Python] Add pyarrow.table factory function
  • ARROW-4904 - [C++] Move implementations in arrow/ipc/test-common.h into libarrow_testing
  • ARROW-4911 - [R] Progress towards completing windows support
  • ARROW-4912 - [C++] add method for easy renaming of a Table's columns
  • ARROW-4913 - [Java][Memory] Add additional methods for observing allocations.
  • ARROW-4945 - [Flight] Enable integration tests in Travis
  • ARROW-4956 - [C#] Allow ArrowBuffers to wrap external Memory
  • ARROW-4959 - [C++][Gandiva][Crossbow] Gandiva crossbow packaging changes.
  • ARROW-4968 - [Rust] Assert that struct array field types match data in…
  • ARROW-4971 - [Go] Add type equality test function
  • ARROW-4972 - [Go] implement ArrayEquals
  • ARROW-4973 - [Go] implement ArraySliceEqual
  • ARROW-4974 - [Go] implement ArrayApproxEqual
  • ARROW-4990 - [C++] Support Array-Array comparison
  • ARROW-4993 - [C++] Add simple build configuration summary
  • ARROW-5000 - [Python] Fix 'SO' DeprecationWarning in setup.py
  • ARROW-5007 - [C++] Remove DCHECK in intrinsic headers
  • ARROW-5020 - [CI] Split Gandiva-related packages into separate .yml file
  • ARROW-5027 - [Python] Python bindings for JSON reader
  • ARROW-5037 - [Rust] [DataFusion] Refactor aggregate module
  • ARROW-5038 - [Rust][DataFusion] Implement AVG aggregate function
  • ARROW-5039 - [Rust][DataFusion] Re-implement CAST support
  • ARROW-5040 - [C++] ArrayFromJSON can't parse Timestamp from strings
  • ARROW-5045 - [Rust] Code coverage silently failing in CI
  • ARROW-5053 - [Rust][DataFusion] Use ARROW_TEST_DATA env var
  • ARROW-5054 - [Release][Flight] Test Flight in Linux/macOS release verification scripts
  • ARROW-5056 - [Packaging] Adjust conda recipes to use ORC conda-forge package on unix systems
  • ARROW-5061 - [Release] Improve 03-binary performance
  • ARROW-5062 - [Java][FlightRPC] Shade com.google.guava usage in Flight
  • ARROW-5063 - [FlightRPC][Java] Test that Flight client connections are independent
  • ARROW-5064 - [Release] Pass PKG_CONFIG_PATH to glib in the verification script
  • ARROW-5066 - [Integration] Add flags to enable/disable implementations in integration/integration_test.py
  • ARROW-5071 - [Archery] Implement running benchmark suite
  • ARROW-5076 - [Release] Improve post binary upload performance
  • ARROW-5077 - [Rust] Change Cargo.toml to use release versions
  • ARROW-5078 - [Documentation] Sphinx fails with RemovedInSphinx30Warning
  • ARROW-5079 - [Release] Add a script that releases C# package
  • ARROW-5080 - [Release] Add a script that releases Rust packages
  • ARROW-5081 - [C++] Use PATH_SUFFIXES when searching for dependencies
  • ARROW-5083 - [Developer] PR merge script improvements: set already-released Fix Version, display warning when no components set
  • ARROW-5088 - [C++] Only add -Werror in debug builds. Add C++ documentation about compiler warning levels
  • ARROW-5091 - [Flight] Rename FlightGetInfo message to FlightInfo
  • ARROW-5093 - [Packaging] Add support for selective binary upload
  • ARROW-5094 - [Packaging] Add APT/Yum verification scripts
  • ARROW-5102 - [C++] Reduce header dependencies
  • ARROW-5108 - [Go] implement reading primitive arrays from Arrow file
  • ARROW-5109 - [Go] implement reading binary/string arrays from Arrow file
  • ARROW-5110 - [Go] implement reading struct arrays from Arrow file
  • ARROW-5111 - [Go] implement reading list arrays from Arrow file
  • ARROW-5112 - [Go] implement writing IPC Arrow stream/file
  • ARROW-5113 - [C++] Fix DoPut with dictionary arrays, add tests
  • ARROW-5115 - [JS] Add Vector Builders and high-level stream primitives
  • ARROW-5116 - [Rust] move kernel related files under compute/kernels
  • ARROW-5124 - [C++] Add support for Parquet in MinGW build
  • ARROW-5126 - [Rust][Parquet] Convert parquet column desc to arrow data type
  • ARROW-5127 - [Rust][Parquet] Add page iterator.
  • ARROW-5136 - [Flight] Call options
  • ARROW-5137 - [Flight] Implement auth API
  • ARROW-5145 - [C++] More input validation in release mode
  • ARROW-5150 - [Ruby] Add Arrow::Table#raw_records
  • ARROW-5155 - [GLib][Ruby] Add support for building union arrays from data type
  • ARROW-5157 - [Website] Add MATLAB to powered by Apache Arrow website
  • ARROW-5162 - [Rust][Parquet] Rename mod reader to arrow.
  • ARROW-5163 - [Gandiva] Cast timestamp/date are incorrectly evaluating year 0097 to 1997
  • ARROW-5164 - [Gandiva][C++] Introduce murmur32 for 32 bit types.
  • ARROW-5165 - [Python] update dev installation docs for --build-type + validate in setup.py
  • ARROW-5168 - [GLib] Add garrow_array_take()
  • ARROW-5171 - [C++] Use LESS instead of LOWER in compare enum
  • ARROW-5172 - [Go] implement reading fixed-size binary arrays from Arrow file
  • ARROW-5178 - [Python] Add Table.from_pydict()
  • ARROW-5179 - [Python] Return plain dicts, not OrderedDict, on Python 3.7+
  • ARROW-5185 - [C++] Add support for Boost with CMake configuration file
  • ARROW-5187 - [Rust] Add ability to convert StructArray to RecordBatch
  • ARROW-5188 - [Rust] Add temporal types to struct builders
  • ARROW-5189 - [Rust][Parquet] Format / display individual fields within a parquet row
  • ARROW-5190 - [R] Discussion: tibble dependency in R package
  • ARROW-5191 - [Rust] Expose CSV and JSON reader schemas
  • ARROW-5203 - [GLib] Add support for Compare filter
  • ARROW-5204 - [C++] Improve builder performance
  • ARROW-5212 - [Go] Support reserve for the data buffer in the BinaryBuilder
  • ARROW-5218 - [C++] Improve build when third-party library locations are specified
  • ARROW-5219 - [C++] Build protobuf_ep in parallel when using Ninja build
  • ARROW-5222 - [Python] Revise pyarrow installation instructions for macOS
  • ARROW-5225 - [Java] Improve performance of BaseValueVector#getValidityBufferSizeFromCount
  • ARROW-5226 - [Gandiva] Add cmp functions for decimals
  • ARROW-5238 - [Python] Convert arguments to pyarrow.dictionary
  • ARROW-5241 - [Python] expose option to disable writing statistics to parquet file
  • ARROW-5250 - [Java] Add javadoc comments to public methods, remove style check suppression.
  • ARROW-5252 - [C++] Use standard-compliant std::variant backport
  • ARROW-5256 - [C++] Add support for LLVM 7.1
  • ARROW-5257 - [Website] Update site to use "official" Apache Arrow logo, add clearly marked links to logo
  • ARROW-5258 - [C++/Python] Collect file metadata of dataset pieces
  • ARROW-5261 - [C++] Add missing scalar definitions for Intervals
  • ARROW-5262 - [Python] Fix typo
  • ARROW-5264 - [Java] Allow enabling/disabling boundary checking by environmental variable
  • ARROW-5266 - [Go] implement read/write IPC for Float16
  • ARROW-5268 - [GLib] Add GArrowJSONReader
  • ARROW-5269 - [C++][Archery] Mark relevant benchmarks as regression
  • ARROW-5275 - [C++] Generic filesystem tests
  • ARROW-5281 - [Rust] Extract DataPageBuilder to test common
  • ARROW-5284 - [Rust] Replace libc with std::alloc for memory allocation
  • ARROW-5286 - [Python] support struct type in from_pandas
  • ARROW-5288 - [Documentation] Enhance the contribution guidelines page
  • ARROW-5289 - [C++] Move arrow/util/concatenate* to arrow/array
  • ARROW-5290 - [Java] Provide a flag to enable/disable null-checking in vector's get methods
  • ARROW-5291 - [Python] Add wrapper for take kernel on Array
  • ARROW-5298 - [Rust] Add debug implementation for buffer data.
  • ARROW-5299 - [C++] ListArray comparison is incorrect
  • ARROW-5309 - [Python] clarify that Schema.append returns new object
  • ARROW-5311 - [C++] use more specific error status types in take
  • ARROW-5313 - [Format] Comments on Field table are a bit confusing
  • ARROW-5317 - [Rust][Parquet] impl IntoIterator for SerializedFileReader
  • ARROW-5319 - [C++][CI][travis skip]
  • ARROW-5321 - [Gandiva][C++] add isnull impl for string types
  • ARROW-5323 - [CI][skip travis]
  • ARROW-5328 - [R] Add shell scripts to do a full package rebuild and test locally
  • ARROW-5329 - [MATLAB] Add support for building MATLAB interface to Feather directly within MATLAB
  • ARROW-5334 - [C++] Ensure all type classes end with "Type"
  • ARROW-5335 - [Python] Raise exception on variable dictionaries in conversion to Python/pandas
  • ARROW-5339 - [C++] Add jemalloc URL to thirdparty/versions.txt so download_dependencies.sh gets it
  • ARROW-5341 - [C++][Documentation] developers/cpp.rst should mention documentation warnings
  • ARROW-5342 - [Format] Formalize "extension types" in Arrow protocol metadata
  • ARROW-5346 - [C++] Revert changed to vendored datetime library
  • ARROW-5349 - [C++][Parquet] Add method to set file path in a parquet::FileMetaData instance
  • ARROW-5361 - [R] Follow DictionaryType/DictionaryArray changes from ARROW-3144
  • ARROW-5363 - [GLib] Fix coding styles
  • ARROW-5364 - [C++] Use ASCII rather than UTF-8 in BuildUtils.cmake comment
  • ARROW-5365 - [C++][CI] Enable ASAN/UBSAN in CI
  • ARROW-5368 - [C++] Disable jemalloc by default with MinGW
  • ARROW-5369 - [C++] Add support for glog on Windows
  • ARROW-5370 - [C++] Use system uriparser if available
  • ARROW-5372 - [GLib] Add support for null/boolean values CSV read option
  • ARROW-5378 - [C++] Local filesystem implementation
  • ARROW-5384 - [Go] implement FixedSizeList array
  • ARROW-5389 - [C++] Add Temporary Directory facility
  • ARROW-5392 - [C++][CI] Disable static build with MinGW on AppVeyor
  • ARROW-5393 - [R] Add tests and example for read_parquet()
  • ARROW-5395 - [C++] Utilize stream EOS in File format
  • ARROW-5396 - [JS] Support files and streams with no record batches
  • ARROW-5401 - [CI][skip appveyor]
  • ARROW-5404 - [C++] force usage of nonstd::sv_lite::string_view instead of std::string_view
  • ARROW-5407 - [C++] Allow building only integration test targets
  • ARROW-5413 - [C++] Skip UTF8 BOM in CSV files
  • ARROW-5415 - [Release] Release script should update R version everywhere
  • ARROW-5416 - [Website] Add Homebrew to project installation page
  • ARROW-5418 - [CI][R] Run code coverage and report to codecov.io
  • ARROW-5420 - [Java] Implement or remove getCurrentSizeInBytes in Variab…
  • ARROW-5427 - [Python] pandas conversion preserve_index=True to force RangeIndex serialization
  • ARROW-5428 - [C++] Add option to set "read extent" in arrow::io::BufferedInputStream
  • ARROW-5429 - [Java] Provide alternative buffer allocation policy
  • ARROW-5432 - [Python] Add NativeFile.read_at()
  • ARROW-5433 - [C++][Parquet] Improve parquet-reader columns information, strip trailing whitespace from test case
  • ARROW-5434 - [Memory][Java] Introduce wrappers for backward compatibility.
  • ARROW-5436 - [Python] parquet.read_table add filters keyword
  • ARROW-5438 - [JS] EOS bytes for sequential readers
  • ARROW-5441 - [C++] Implement FindArrowFlight.cmake
  • ARROW-5442 - [Website] Clarify what makes a release artifact "official"
  • ARROW-5443 - [Crossbow] Turn parquet build off for Gandiva.
  • ARROW-5447 - [Ruby] Ensure flushing test gz file
  • ARROW-5449 - [C++] Test extended-length paths on Windows
  • ARROW-5451 - [C++][Gandiva] Support cast/round functions for decimal
  • ARROW-5452 - [R] Add API documentation website (pkgdown)
  • ARROW-5461 - [Java] Add micro-benchmarks for Float8Vector and allocators
  • ARROW-5463 - [Rust] Add AsRef trait for Buffer.
  • ARROW-5464 - [Archery] Fix default diff --benchmark-filter
  • ARROW-5465 - [Crossbow] Support writing submitted job definition yaml to a file
  • ARROW-5466 - [Java] Dockerize Java builds in Travis CI, run multiple JDKs in single entry
  • ARROW-5467 - [Go] implement read/write IPC for Time32/64 arrays
  • ARROW-5468 - [Go] implement read/write IPC for Timestamp arrays
  • ARROW-5469 - [Go] implement read/write IPC for Date32/64 arrays
  • ARROW-5470 - [CI] Fix Travis-CI R job that broke with the local fs patch
  • ARROW-5472 - [Development] Add warning to PR merge tool if no JIRA component is set
  • ARROW-5474 - [C++] Document Boost 1.58 as minimum supported version, add docker-compose entry for it, fix broken cpp/Dockerfile* builds
  • ARROW-5475 - [Python] Add Python binding for arrow::Concatenate
  • ARROW-5476 - [Java][Memory] Fix Netty Arrow Buf.
  • ARROW-5477 - [C++] Check required RapidJSON version
  • ARROW-5478 - [Packaging] Drop Ubuntu 14.04 support
  • ARROW-5481 - [GLib] Add "error" parameter document
  • ARROW-5485 - [C++] Install libraries from googletest_ep into build output directory on non-Windows platforms.
  • ARROW-5485 - [Crossbow] Disable unit tests in Gandiva macOS crossbow job until underlying issue resolved
  • ARROW-5486 - [GLib] Add binding of gandiva::FunctionRegistry and related things
  • ARROW-5488 - [R] Workaround when C++ lib not available
  • ARROW-5490 - [C++] Remove ARROW_BOOST_HEADER_ONLY
  • ARROW-5491 - [C++] Remove unnecessary semicolons following MACRO definitions
  • ARROW-5492 - [R] Add "col_select" argument to read_* functions to read subset of columns
  • ARROW-5495 - [C++] Update some dependency URLs from http to https
  • ARROW-5496 - [R][CI] Fix relative paths in R codecov.io reporting
  • ARROW-5498 - [C++][CI] Fix Flatbuffers related error with MinGW
  • ARROW-5499 - [R] Alternate bindings for when libarrow is not found
  • ARROW-5500 - [R] read_csv_arrow() signature should match readr::read_csv()
  • ARROW-5503 - [R] add read_json()
  • ARROW-5504 - [R] move use_threads argument to global option
  • ARROW-5509 - [R] Add basic write_parquet
  • ARROW-5511 - [Packaging] Enable Flight in Conda packages
  • ARROW-5512 - [C++] Rough API skeleton for C++ Datasets API / framework
  • ARROW-5513 - [Java] Refactor method name for getstartOffset to use camel case
  • ARROW-5516 - [Python][Documentation] Development page for pyarrow has a missing dependency in using pip
  • ARROW-5518 - [Java] Set VectorSchemaRoot rowCount to 0 on allocateNew and clear
  • ARROW-5524 - [C++] Turn off PARQUET_BUILD_ENCRYPTION in CMake if OpenSSL not found (#4494)
  • ARROW-5526 - [GitHub] Add more prominent notice to ISSUE_TEMPLATE.md to direct bug reports to JIRA
  • ARROW-5529 - [Flight] Allow serving with multiple TLS certificates
  • ARROW-5531 - [Python] Implement Array.from_buffers for varbinary and nested types, add DataType.num_buffers property
  • ARROW-5533 - [C++][Plasma] make plasma client thread safe
  • ARROW-5534 - [GLib] Add garrow_table_concatenate()
  • ARROW-5535 - [GLib] Add garrow_table_slice()
  • ARROW-5537 - [JS] Support delta dictionaries in RecordBatchWriter and DictionaryBuilder
  • ARROW-5538 - [C++] Restrict minimum OpenSSL version to 1.0.2
  • ARROW-5541 - [R] casts from negative int32 to uint32 and uint64 are now safe
  • ARROW-5544 - [Archery] Don't return non-zero on regressions
  • ARROW-5545 - [C++][Docs] Clarify expectation of UTC values for timestamps with time zones
  • ARROW-5547 - [C++][FlightRPC] Support pkg-config for Arrow Flight
  • ARROW-5552 - [Go] make Schema, Field and simpleRecord implement Stringer
  • ARROW-5554 - [Python] Added a python wrapper for arrow::Concatenate()
  • ARROW-5555 - [R] Add install_arrow() function to assist the user in obtaining C++ runtime libraries
  • ARROW-5556 - [Doc][Python] Document JSON reader
  • ARROW-5557 - [C++] Add VisitBits benchmark
  • ARROW-5565 - [Python][Docs] Add instructions how to use gdb to debug C++ libraries when running Python unit tests
  • ARROW-5567 - [C++] Fix build error of memory-benchmark
  • ARROW-5571 - [R] Rework handling of ARROW_R_WITH_PARQUET
  • ARROW-5574 - [R] documentation error for read_arrow()
  • ARROW-5581 - [Java] Provide interfaces and initial implementations for vector sorting
  • ARROW-5582 - [Go] implement RecordEqual
  • ARROW-5586 - [R] convert Array of LIST type to R lists
  • ARROW-5587 - [Java] Add more style check rules for Java code
  • ARROW-5590 - [R] Run "no libarrow" R build in the same CI entry if possible
  • ARROW-5591 - [Go] implement read/write IPC for Duration & Intervals
  • ARROW-5597 - [Packaging] Add Flight deb packages
  • ARROW-5600 - [R] R package namespace cleanup
  • ARROW-5602 - [Java][Gandiva] Add tests for round/cast
  • ARROW-5604 - [Go] improve coverage of TypeTraits
  • ARROW-5609 - [C++] Set CMP0068 CMake policy to avoid macOS warnings
  • ARROW-5612 - [Python][Doc] Add prominent note that date_as_object option changed with Arrow 0.13
  • ARROW-5621 - [Go] implement read/write IPC for Decimal128 arrays
  • ARROW-5622 - [C++][Dataset] Support pkg-config for Arrow Datasets
  • ARROW-5625 - [R] convert Array of struct type to data frame columns
  • ARROW-5632 - [Doc] Basic instructions for using Xcode with Arrow
  • ARROW-5633 - [Python] Enable bz2 in Linux wheels
  • ARROW-5635 - [C++] Added a Compact() method to Table.
  • ARROW-5637 - [Java][C++][Gandiva] Complete In Expression Support
  • ARROW-5639 - [Java] Remove floating point computation from getOffsetBufferValueCapacity
  • ARROW-5641 - [GLib] Remove enums files generated by GNU Autotools from Git targets
  • ARROW-5643 - [FlightRPC] Add ability to override SSL hostname checking
  • ARROW-5650 - [Python] Update manylinux dependency versions
  • ARROW-5652 - [CI] Fix lint docker image
  • ARROW-5653 - [CI] Fix cpp docker image
  • ARROW-5656 - [Python][Packaging] Fix macOS wheel builds, add Flight support
  • ARROW-5659 - [C++] Add support for finding OpenSSL installed by Homebrew
  • ARROW-5660 - [GLib][CI] Use Xcode 10.2
  • ARROW-5661 - [Gandiva][C++] support hash functions for decimals in gandiva
  • ARROW-5662 - [C++] Add support for BOOST_SOURCE=AUTO|BUNDLED|SYSTEM
  • ARROW-5663 - [Packaging][RPM] Update CentOS packages for 0.14.0
  • ARROW-5664 - [Crossbow] Execute nightly crossbow tests on CircleCI instead of Travis
  • ARROW-5668 - [C++/Python] Include 'not null' in schema fields pretty print
  • ARROW-5669 - [Python][Packaging] Add ARROW_TEST_DATA env variable to Crossbow Linux Wheel build
  • ARROW-5670 - [Crossbow] get_apache_mirror.py fails with TLS error on macOS with Python 3.5
  • ARROW-5671 - [Crossbow] macOS Python wheels failing
  • ARROW-5672 - [Java] Refactor redundant method modifier
  • ARROW-5683 - [R] Add snappy to Rtools Windows builds
  • ARROW-5684 - [Packaging][deb] Add support for Ubuntu 19.04
  • ARROW-5685 - [Packaging][deb] Add support for Apache Arrow Datasets
  • ARROW-5687 - [C++] Remove remaining uses of ARROW_BOOST_VENDORED
  • ARROW-5690 - [Packaging][Python] Fix macOS wheel building
  • ARROW-5694 - [Python] Support list of Decimals in conversion to pandas
  • ARROW-5695 - [C#][Release] Run sourcelink test in verify-release-candidate.sh
  • ARROW-5696 - [C++][Gandiva] Introduce castVarcharVarchar
  • ARROW-5699 - [C++] Optimize decimal128 parsing
  • ARROW-5701 - [C++][Gandiva] Build expr with specific sv
  • ARROW-5702 - [C++] parquet::arrow::FileReader::GetSchema()
  • ARROW-5704 - [C++] Stop using ARROW_TEMPLATE_EXPORT for SparseTensorImpl
  • ARROW-5705 - [Java] Optimize BaseValueVector#computeCombinedBufferSize logic
  • ARROW-5706 - [Java] Remove type conversion in getValidityBufferValueCapacity
  • ARROW-5707 - [Java] Improve the performance and code structure for ArrowRecordBatch
  • ARROW-5710 - [C++] Allow compiling Gandiva with Ninja on Windows
  • ARROW-5715 - [Release] Verify Ubuntu 19.04 APT repository
  • ARROW-5718 - [R] auto splice data frames in record_batch() and table()
  • ARROW-5720 - [C++] Create benchmarks for decimal related classes.
  • ARROW-5721 - [Rust] Move array related code into a separate module
  • ARROW-5724 - [R][CI] AppVeyor build should use ccache
  • ARROW-5725 - [Crossbow] Port conda recipes to azure pipelines
  • ARROW-5726 - [Java] Implement a common interface for int vectors
  • ARROW-5727 - [Python][CI] Install pytest-faulthandler before running tests
  • ARROW-5748 - [Packaging][deb] Add support for Debian GNU/Linux buster
  • ARROW-5749 - [Python] Added python binding for Table::CombineChunks
  • ARROW-5751 - [Python][Packaging] Ensure that c-ares is linked statically in Python wheels
  • ARROW-5752 - [Java] Improve the performance of ArrowBuf#setZero
  • ARROW-5755 - [Rust][Parquet] Derive clone for Type.
  • ARROW-5768 - [Release] Remove needless empty lines at the end of CHANGELOG.md
  • ARROW-5773 - [R] Clean up documentation before release
  • ARROW-5780 - [C++] Add benchmark for Decimal operations
  • ARROW-5782 - [Release] Setup test data for Flight in dev/release/01-perform.sh
  • ARROW-5783 - [Release][C#] Exclude dummy.git from RAT check
  • ARROW-5785 - [Rust] Rust datafusion implementation should not depend on rustyline
  • ARROW-5787 - [Release][Rust] Use local modules to verify RC
  • ARROW-5793 - [Release] Avoid duplicate known host SSH error in dev/release/03-binary.sh
  • ARROW-5794 - [Release] Skip uploading already uploaded binaries
  • ARROW-5795 - [Release] Add missing waits on uploading binaries
  • ARROW-5796 - [Release][APT] Update expected package list
  • ARROW-5797 - [Release][APT] Update supported distributions
  • ARROW-5818 - [Java][Gandiva] support varlen output vectors
  • ARROW-5820 - [Release] Remove undefined variable check from verify script
  • ARROW-5826 - [Website] Blog post for 0.14.0 release announcement
  • PARQUET-1243 - [C++] Throw more informative exception when reading a length-0 Parquet file
  • PARQUET-1411 - [C++] Add parameterized logical annotations to Parquet metadata
  • PARQUET-1422 - [C++] Use common Arrow IO interfaces throughout codebase
  • PARQUET-1517 - [C++] Crypto package updates to match the final spec
  • PARQUET-1523 - [C++] Vectorize Comparator interface, remove virtual calls on inner loop. Refactor Statistics to not require PARQUET_EXTERN_TEMPLATE
  • PARQUET-1569 - [C++] Consolidate shared unit testing header files
  • PARQUET-1582 - [C++] Add ToString method to ColumnDescriptor
  • PARQUET-1583 - [C++] Remove superfluous parquet::Vector class
  • PARQUET-1586 - [C++] Add --dump options to parquet-reader tool to dump def/rep levels
  • PARQUET-1603 - [C++] rename parquet::LogicalType to parquet::ConvertedType

Bug Fixes

  • ARROW-61 - [Java] Method can return the value bigger than long MAX_VALUE
  • ARROW-352 - [Format] Interval(DAY_TIME) has no unit
  • ARROW-1837 - [Java][Integration] Fix unsigned round trip integration tests
  • ARROW-2119 - [IntegrationTest] Add test case with a stream having no record batches
  • ARROW-2136 - [Python] Check null counts for non-nullable fields when converting from pandas.DataFrame with supplied schema
  • ARROW-2256 - [C++] Fix libfuzzer builds for clang-7
  • ARROW-2461 - [Python] Build manylinux2010 wheels
  • ARROW-2590 - [Python] Pyspark python_udf serialization error on grouped map (Amazon EMR)
  • ARROW-3344 - [Python] Disable flaky Plasma test
  • ARROW-3399 - [Python] Implementing numpy matrix serialization
  • ARROW-3650 - [Python] warn on converting DataFrame with mixed type column names
  • ARROW-3801 - [Python] Pandas-Arrow roundtrip makes pd categorical index not writeable
  • ARROW-4021 - [Ruby] Error building red-arrow on msys2
  • ARROW-4076 - [Python] Validate ParquetDataset schema after filtering
  • ARROW-4139 - [Python][Parquet] Wrap new parquet::LogicalType, cast min/max statistics based on LogicalType
  • ARROW-4301 - [Java] use arrow-jni profile for both gandiva/orc
  • ARROW-4301 - [Java][Gandiva] Update version manually
  • ARROW-4324 - [Python] Triage broken type inference logic in presence of a mix of NumPy dtype-having objects and other scalar values
  • ARROW-4350 - [Python] Fix conversion from Python to Arrow with nested lists and NumPy dtype=object items
  • ARROW-4433 - [R] Segmentation fault when instantiating arrow::table from data frame
  • ARROW-4447 - [C++] Investigate dynamic linking for libthrift
  • ARROW-4516 - [Python] Error while creating a ParquetDataset on a path without `_common_dataset` but with an empty `_tempfile`
  • ARROW-4523 - [JS] Add row proxy generation benchmark
  • ARROW-4651 - [Flight] Use URIs instead of host/port pair
  • ARROW-4665 - [C++] With glog activated, DCHECK macros are redefined
  • ARROW-4675 - [Python] Fix pyarrow.deserialize failure when reading payload in Python 3 payload generated in Python 2
  • ARROW-4694 - [CI] Improve detect-changes.py on Travis PRs
  • ARROW-4723 - [Python] Ignore "hidden" files that start with underscore
  • ARROW-4725 - [C++] Enable dictionary builder tests with MinGW build
  • ARROW-4823 - [C++][Python] Do not close raw file handle in ReadaheadSpooler, check that file handles passed to read_csv are not closed
  • ARROW-4832 - [Python] pandas Index metadata for RangeIndex is incorrect
  • ARROW-4845 - [R] Compiler warnings on Windows MingW64
  • ARROW-4851 - [Java] BoundsChecking.java defaulting behavior for old drill parameter seems off
  • ARROW-4877 - [Plasma] CI failure in test_plasma_list
  • ARROW-4884 - [C++] conda-forge thrift-cpp package not available via pkg-config or cmake
  • ARROW-4885 - [C++/Python] Enable Decimal parsing in CSV
  • ARROW-4886 - [Rust] Cast to list with offset
  • ARROW-4923 - [Java] Add methods to set long value at given index in DecimalVector
  • ARROW-4934 - [Python] Address deprecation notice that will be a bug in Python 3.8
  • ARROW-5019 - [C#] ArrowStreamWriter doesn't work on a non-seekable stream
  • ARROW-5049 - [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
  • ARROW-5051 - [GLib][Gandiva] Don't return temporary memory
  • ARROW-5055 - [Ruby][MSYS2] libparquet needs to be installed in MSYS2 for ruby
  • ARROW-5058 - [Release] Fix typos in vote e-mail template
  • ARROW-5059 - [C++][Gandiva] cbrt_* floating point tests can fail due to exact comparisons
  • ARROW-5065 - [Rust] cast kernel does not support casting from Int64
  • ARROW-5068 - [Gandiva][Packaging] Fix gandiva nightly builds after the CMake refactor
  • ARROW-5090 - Parquet linking fails on MacOS due to @rpath in dylib
  • ARROW-5092 - [C#] Create a dummy .git directory to download the source files from GitHub with Source Link
  • ARROW-5095 - [Flight][C++] Expose server error message in DoGet
  • ARROW-5096 - [Packaging][deb] Add missing plasma-store-server packages
  • ARROW-5097 - [Packaging][CentOS6] Remove needless dependencies
  • ARROW-5098 - [Website] Update how to install .deb by APT
  • ARROW-5100 - [JS] Remove swap while collapsing contiguous buffers
  • ARROW-5117 - [Go] fix panic when nil or empty slices are appended to builders
  • ARROW-5119 - [Go] fix Boolean stringer implementation
  • ARROW-5122 - [Python] pyarrow.parquet.read_table raises non-file path error when given a windows path to a directory
  • ARROW-5128 - [Packaging][CentOS][Conda] Numpy not found in nightly builds
  • ARROW-5129 - [Rust] Column writer bug: check dictionary encoder when adding a new data page
  • ARROW-5130 - [C++][Python] Limit exporting of std::* symbols
  • ARROW-5132 - [Java] Errors on building gandiva_jni.dll on Windows with Visual Studio 2017
  • ARROW-5138 - [Python] Add documentation about pandas preserve_index option
  • ARROW-5140 - [Bug?][Parquet] Can write a jagged array column of strings to disk, but hit `ArrowNotImplementedError` on read
  • ARROW-5142, ARROW-5732, ARROW-5735 - [CI] Emergency fixes
  • ARROW-5144 - [Python] ParquetDataset and ParquetPiece not serializable
  • ARROW-5146 - [Dev] Fix project name inference in merge script
  • ARROW-5147 - [C++] Add missing dependencies to Brewfile
  • ARROW-5148 - [Gandiva] Allow linking with RTTI-disabled LLVM builds
  • ARROW-5149 - [Packaging][Wheel] Pin LLVM to version 7 in windows builds
  • ARROW-5152 - [Python] Fix CMake warnings
  • ARROW-5159 - [Rust] Unable to build benches in arrow crate.
  • ARROW-5160 - [C++] Don't evaluate expression twice in ABORT_NOT_OK
  • ARROW-5166 - [Python][Parquet] Statistics for uint64 columns may overflow
  • ARROW-5167 - [C++] Upgrade string-view-light to latest
  • ARROW-5169 - [Python] preserve field nullability of specified schema in Table.from_pandas
  • ARROW-5173 - [Go] handle multiple concatenated record batches
  • ARROW-5174 - [Go] implement Stringer for DataTypes
  • ARROW-5177 - [C++/Python] Check column index when reading Parquet column
  • ARROW-5183 - [CI] Fix AppVeyor failure
  • ARROW-5184 - [Rust] Broken links and other documentation warnings
  • ARROW-5186 - [Plasma] Fix crash caused by improper free on CUDA memory
  • ARROW-5194 - [C++][Plasma] TEST(PlasmaSerialization, GetReply) is failing
  • ARROW-5195 - [C++] Detect null strings in CSV string columns
  • ARROW-5201 - [Python] handle collections.abc deprecation warnings
  • ARROW-5208 - [Python] Add mask argument to pyarrow.infer_type, do not look at masked values when inferring output type in pyarrow.array
  • ARROW-5214 - [C++] Fix thirdparty download script
  • ARROW-5217 - [Rust][DataFusion] Fix failing tests
  • ARROW-5232 - [Java] Avoid runaway doubling of vector size
  • ARROW-5233 - [Go] Migrate to flatbuffers-v1.11.0
  • ARROW-5237 - [Python] populate _pandas_api.version
  • ARROW-5240 - [C++][CI] pin cmake_format
  • ARROW-5242 - [C++] Update vendored HowardHinnant/date to master
  • ARROW-5243 - [Java][Gandiva] Add decimal compare tests
  • ARROW-5245 - [CI][C++] Unpin cmake format (current version is 5.1)
  • ARROW-5246 - [Go] use Go-1.12.x in CI
  • ARROW-5249 - [Java] Add auth capability to Flight async operations (#4238)
  • ARROW-5253 - [C++] Fix snappy external build
  • ARROW-5254 - [Flight][Java] Change Flight doAction to allow multiple responses in Java
  • ARROW-5255 - [Java] Proof-of-concept of Java extension types
  • ARROW-5260 - [Python] Fix crash when deserializing from components in another process
  • ARROW-5274 - [JavaScript] Wrong array type for countBy
  • ARROW-5283 - [C++][Plasma] Erase object id in client when aborting object
  • ARROW-5285 - [C++][Plasma] Implement to release GpuProcessHandle
  • ARROW-5293 - [C++] Take kernel on DictionaryArray does not preserve ordered flag
  • ARROW-5294 - [Python][CI] Fix manylinux1 build
  • ARROW-5296 - [Java] Ignore timeout-based Flight tests for now
  • ARROW-5301 - [Python] update parquet docs on multithreading
  • ARROW-5304 - [C++] fix thread safety of CudaDeviceManager::GetInstance
  • ARROW-5306 - [CI][GLib] Disable GTK-Doc
  • ARROW-5308 - [Go] remove deprecated Feather format
  • ARROW-5314 - [Go] fix bug for String Arrays with offset
  • ARROW-5314 - [Go] Fix bug for FixedSizeBinary with offset
  • ARROW-5318 - [Python] pyarrow hdfs reader overrequests
  • ARROW-5325 - [Archery][Benchmark] Output properly formatted jsonlines from benchmark diff cli command
  • ARROW-5330 - [CI][skip appveyor]
  • ARROW-5332 - [R] Update R package README with richer installation instructions
  • ARROW-5348 - [Java][CI] Add missing gandiva javadoc
  • ARROW-5360 - [Rust] Update rustyline to fix build
  • ARROW-5362 - [C++] Fix compression test memory usage
  • ARROW-5371 - [Release] Add tests for dev/release/00-prepare.sh
  • ARROW-5373 - [Java] Add missing details for Gandiva Java Build
  • ARROW-5376 - [C++] Workaround for gcc 5.4.0 bug
  • ARROW-5383 - [Go] Update flatbuf for new Duration type
  • ARROW-5387 - [Go] properly handle sub-slice of List
  • ARROW-5388 - [Go] use arrow.TypeEquals in array.NewChunked
  • ARROW-5390 - [CI][skip appveyor]
  • ARROW-5397 - [FlightRPC] Add TLS certificates for testing Flight
  • ARROW-5398 - [Python] Fix Flight tests
  • ARROW-5403 - [C++] Use GTest shared libraries with BUNDLED build, always use BUNDLED with MSVC
  • ARROW-5411 - [C++][Python] Build error building on Mac OS Mojave
  • ARROW-5412 - [Integration] Add Java option for netty reflection
  • ARROW-5419 - [C++] Allow recognizing empty strings as null strings in CSV files
  • ARROW-5421 - [Packaging][Crossbow] Duplicated key in nightly test configuration
  • ARROW-5422 - [CI] [C++] Build failure with Google Benchmark
  • ARROW-5430 - [Python] Raise ArrowInvalid for pyints larger than int64
  • ARROW-5435 - [Java] Add test for IntervalYearVector#getAsStringBuilder
  • ARROW-5437 - [Python] Missing pandas pytest marker from parquet tests
  • ARROW-5446 - [C++][CMake] Install arrow/util/config.h into CMAKE_INSTALL_INCLUDEDIR
  • ARROW-5448 - [C++][CI][MinGW][skip travis]
  • ARROW-5453 - [C++] Update to cmake-format=0.5.2 and pin again
  • ARROW-5455 - [Rust] Build broken by 2019-05-30 Rust nightly
  • ARROW-5456 - [GLib][Plasma] Fix dependency order on building document
  • ARROW-5457 - [GLib][Plasma] Fix environment variable name for test
  • ARROW-5459 - [Go] implement Stringer for float16 DataType
  • ARROW-5462 - [Go] support writing zero-length List arrays
  • ARROW-5479 - [Rust][DataFusion] Use ARROW_TEST_DATA instead of relative path for testing
  • ARROW-5487 - [Docs] Fix Sphinx failure
  • ARROW-5493 - [Go][Integration] add Go support for IPC integration tests
  • ARROW-5507 - [Plasma][CUDA] Fix compile error
  • ARROW-5514 - [C++] Fix pretty-printing uint64 values
  • ARROW-5517 - [C++] Only check header basename for 'internal' when collecting public headers
  • ARROW-5520 - [Packaging][deb] Add support for building on arm64
  • ARROW-5521 - [Packaging] Use Apache RAT 0.13
  • ARROW-5528 - [C++] Fix a bug in Concatenate() for arrays with no value buffers.
  • ARROW-5532 - [JS] Field Metadata Not Read
  • ARROW-5551 - [Go] implement FixedSizeArrays with 2-buffers layout
  • ARROW-5553 - [Ruby] Use the official packages to install Apache Arrow
  • ARROW-5576 - [C++] Query ASF mirror system for URL and use when downloading Thrift
  • ARROW-5577 - [C++][Alpine] Correct googletest shared library paths on non-Windows to fix Alpine build
  • ARROW-5583 - [Java] When the isSet of a NullableValueHolder is 0, the buffer field should not be used
  • ARROW-5584 - [Java] Add import for link reference in FieldReader javadoc
  • ARROW-5589 - [C++] Add missing nullptr check during flatbuffer decoding
  • ARROW-5592 - [Go] implement Duration array
  • ARROW-5596 - [Python] Fix Python-3 syntax only in test_flight.py
  • ARROW-5601 - [C++][Gandiva] fail if the output type is not supported
  • ARROW-5603 - [Python] Register custom pytest markers to avoid warnings
  • ARROW-5605 - [C++] Verify Flatbuffer messages in more places to prevent crashes due to bad inputs
  • ARROW-5606 - [Python] deal with deprecated RangeIndex._start/_stop/_step
  • ARROW-5608 - [C++][parquet] Fix invalid memory access when using parquet::arrow::ColumnReader
  • ARROW-5615 - [C++] gcc 5.4.0 doesn't want to parse inline C++11 string R literal
  • ARROW-5616 - [C++][Python] Fix -Wwrite-strings warning when building against Python 2.7 headers
  • ARROW-5617 - [C++] thrift_ep 0.12.0 fails to build when using ARROW_BOOST_VENDORED=ON
  • ARROW-5619 - [C++] Make get_apache_mirror.py workable with Python 3.5
  • ARROW-5623 - [GLib][CI] Use system Meson on macOS
  • ARROW-5624 - [C++] Fix typo causing build failure when -Duriparser_SOURCE=BUNDLED
  • ARROW-5626 - [C++] Fix caching of expressions with decimals
  • ARROW-5629 - [C++] Fix Coverity issues
  • ARROW-5631 - [C++] Fix FindBoost targets with cmake3.2
  • ARROW-5644 - [Python] test_flight.py::test_tls_do_get appears to hang
  • ARROW-5647 - [Python] Accessing a file from Databricks using pandas read_parquet with the pyarrow engine fails with: Passed non-file path: /mnt/aa/example.parquet
  • ARROW-5648 - [C++] Avoid using codecvt
  • ARROW-5654 - [C++][Python] Add ChunkedArray::Validate method that checks chunk types for consistency, invoke in Python
  • ARROW-5657 - [C++] "docker-compose run cpp" broken in master
  • ARROW-5674 - [Python] Missing pandas pytest markers from test_parquet.py
  • ARROW-5675 - [Doc] Fix typo in Xcode workflow documentation
  • ARROW-5678 - [R][Lint] Fix hadolint docker linting error
  • ARROW-5693 - [Go] skip IPC integration tests for Decimal128
  • ARROW-5697 - [GLib] Use system pkg-config in c_glib/Dockerfile to correctly find system libraries such as libglib
  • ARROW-5698 - [R] Fix docker-compose build
  • ARROW-5709 - [C++] Fix gandiva-date_time_test failure on Windows
  • ARROW-5714 - [JS] Inconsistent behavior in Int64Builder with/without BigNum
  • ARROW-5723 - [C++][Arrow] Fix crossbow failure
  • ARROW-5728 - [Python] Pin jpype1 version to 0.6.3 due to CI breakage from 0.7.0
  • ARROW-5729 - [Python][Java] ArrowType.Int object has no attribute 'isSigned'
  • ARROW-5730 - [Python][CI] Selectively skip test cases in the dask integration test
  • ARROW-5732 - [C++] macOS builds failing idiosyncratically on master with warnings from pmmintrin.h
  • ARROW-5735 - [C++] Appveyor builds failing persistently in thrift_ep build
  • ARROW-5737 - [Crossbow] Use Python version 2.7 in the gandiva tasks
  • ARROW-5738 - [Crossbow][Conda] OSX package builds are failing with missing intrinsics
  • ARROW-5739 - [CI] Fix python docker image
  • ARROW-5750 - [Java] Fix java compilation errors
  • ARROW-5754 - [C++] Add override mark for ~GrpcStreamWriter
  • ARROW-5765 - [C++] Fix TestDictionary.Validate in release mode, add docker-compose job for testing C++ release build
  • ARROW-5769 - [Release] Ensure setting up test data in dev/release/00-prepare.sh
  • ARROW-5770 - [C++] Fix -Wpessimizing-move in result.h
  • ARROW-5771 - [Python] Add pytz to conda_env_python.yml to fix python-nopandas build
  • ARROW-5774 - [Java][Documentation] Document the need to checkout git submodules for flight
  • ARROW-5781 - [Archery] Ensure benchmark clone accepts remote in revision
  • ARROW-5791 - [Python] pyarrow.csv.read_csv hangs + eats all RAM
  • ARROW-5816 - [Release] Parallel curl does not work reliably in verify-release-candidate-sh
  • ARROW-5922 - [Python] Unable to connect to HDFS from a worker/data node on a Kerberized cluster using pyarrow's hdfs API
  • PARQUET-1402 - [C++] Parquet files with dictionary page offset as 0 are not readable
  • PARQUET-1405 - Fix writing statistics into DataPageHeader
  • PARQUET-1565 - [C++] Add default case to catch all unhandled physical types
  • PARQUET-1571 - [C++] Fix BufferedInputStream when buffer exactly exhausted
  • PARQUET-1574 - [C++] fix parquet-encoding-test
  • PARQUET-1581 - [C++] Fix undefined behavior in encoding.cc
kou published 0.13.0

Apache Arrow 0.13.0 (2019-04-01)

Bug Fixes

  • ARROW-295 - [Documentation] Add DOAP file
  • ARROW-1171 - [C++] Segmentation faults on Fedora 24 with pyarrow-manylinux1 and self-compiled turbodbc
  • ARROW-2392 - [C++] Check schema compatibility when writing a RecordBatch
  • ARROW-2399 - [Rust] Builder<T> should not provide a set() method
  • ARROW-2598 - [Python] table.to_pandas segfault
  • ARROW-3086 - [GLib] GISCAN fails due to conda-shipped openblas
  • ARROW-3096 - [Python] Update Python source build instructions given Anaconda/conda-forge toolchain migration
  • ARROW-3133 - [C++] Remove allocation from Binary Boolean Kernels.
  • ARROW-3133 - [C++] Remove allocations from InvertKernel
  • ARROW-3208 - [C++] Fix Cast dictionary to numeric segfault
  • ARROW-3426 - [CI] Java integration test very verbose
  • ARROW-3564 - [C++] Fix dictionary encoding logic for Parquet 2.0
  • ARROW-3578 - [Release] Resolve all hard and symbolic links in tar.gz
  • ARROW-3593 - [R] CI builds failing due to GitHub API rate limits
  • ARROW-3606 - [Crossbow] Fix flake8 crossbow warnings
  • ARROW-3669 - [Python] Raise error on Numpy byte-swapped array
  • ARROW-3843 - [C++][Python] Allow a "degenerate" Parquet file with no columns
  • ARROW-3923 - [Java] JDBC Time Fetches Without Timezone
  • ARROW-4007 - [Java][Plasma] Plasma JNI tests failing
  • ARROW-4050 - [Python][Parquet] core dump on reading parquet file
  • ARROW-4081 - [Go] Sum methods panic when the array is empty
  • ARROW-4104 - [Java] race in AllocationManager during release
  • ARROW-4108 - [Python/Java] Spark integration tests do not work
  • ARROW-4117 - [Python] "asv dev" command fails with latest revision
  • ARROW-4140 - [C++][Gandiva] Compiled LLVM bitcode file path may result in libraries being non-relocatable
  • ARROW-4145 - [C++] Find Windows-compatible strptime implementation
  • ARROW-4181 - [Python] Fixes for Numpy struct array conversion
  • ARROW-4192 - [CI] Fix broken dev/run_docker_compose.sh script
  • ARROW-4213 - [Flight] Fix incompatibilities between C++ and Java
  • ARROW-4244 - [Format] Clarify padding/alignment rationale/recommendation.
  • ARROW-4250 - [C++] adding explicit epsilon for ApproxEquals and corresponding assert macro
  • ARROW-4252 - [C++] Fix missing Status code and newline
  • ARROW-4253 - [GLib] Cannot use non-system Boost specified with $BOOST_ROOT
  • ARROW-4254 - [C++][Gandiva] Build with Boost from Ubuntu Trusty apt
  • ARROW-4255 - [C++] Eagerly initialize name_to_index_ to avoid race
  • ARROW-4261 - [C++] Make CMake paths for IPC, Flight, Thrift, and Plasma subproject compatible
  • ARROW-4264 - [C++] Clarify use of DCHECKs in Kernels
  • ARROW-4267 - [C++/Parquet] Handle duplicate and struct columns in RowGroup reads
  • ARROW-4274 - [C++][Gandiva] split decimal into two parts
  • ARROW-4275 - [C++][Gandiva] Fix slow decimal test
  • ARROW-4280 - Update README.md to reflect parquet deps
  • ARROW-4282 - [Rust] builder benchmark is broken
  • ARROW-4284 - [C#] File / Stream serialization fails due to type mismatch / missing footer
  • ARROW-4295 - [C++][Plasma] Fix incorrect log message
  • ARROW-4296 - [Plasma] Use one mmap file by default, prevent crash with -f
  • ARROW-4308 - [Python] pyarrow has a hard dependency on pandas
  • ARROW-4311 - [Python] Regression on pq.ParquetWriter incorrectly handling source string
  • ARROW-4312 - [C++] Only run 2 * os.cpu_count() clang-format instances at once
  • ARROW-4319 - [C++][Plasma] plasma/store.h pulls in flatbuffer dependency
  • ARROW-4320 - [C++] Add tests for non-contiguous tensors
  • ARROW-4322 - [C++] Don't use _GLIBCXX_USE_CXX11_ABI=0 anymore in docker scripts
  • ARROW-4323 - [Packaging] Fix failing OSX clang conda forge builds
  • ARROW-4326 - [C++] Development instructions in python/development.rst will not work for many Linux distros with new conda-forge toolchain
  • ARROW-4327 - [Python] Add requirements-build.txt convenience file
  • ARROW-4328 - Add an ARROW_USE_OLD_CXXABI configure var to R
  • ARROW-4329 - Python should include the parquet headers
  • ARROW-4342 - [Gandiva][Java] Ignore flaky test.
  • ARROW-4347 - [CI][Python] Also run Python builds when Java affected.
  • ARROW-4349 - [C++] Add static linking option for benchmarks, fix Windows benchmark build failures
  • ARROW-4351 - [C++] Fix CMake errors when neither building shared libraries nor tests
  • ARROW-4355 - [C++] Reorder testing code into src/arrow/testing
  • ARROW-4360 - [C++] Query homebrew for Thrift
  • ARROW-4364 - [C++] Fix CHECKIN warnings
  • ARROW-4366 - [Docs] Change extension from format/README.md to format/README.rst
  • ARROW-4367 - [C++] StringDictionaryBuilder segfaults on Finish with only null entries
  • ARROW-4368 - [Docs] Fix install document for Ubuntu 16.04 or earlier
  • ARROW-4370 - [Python][Bool] to pandas
  • ARROW-4374 - [C++] DictionaryBuilder does not correctly report length and null_count
  • ARROW-4381 - [CI] Update linter container build instructions
  • ARROW-4382 - [C++] Improve new cpplint output readability
  • ARROW-4384 - [C++] Running "format" target on new Windows 10 install opens "how do you want to open this file" dialog
  • ARROW-4385 - [Packaging] Fix PyArrow version update pattern on release
  • ARROW-4389 - [R] Don't install clang-tools in test job
  • ARROW-4395 - [JS] Fix ts-node error running bin/arrow2csv
  • ARROW-4400 - [CI] Switch to https repo for llvm
  • ARROW-4403 - [Rust] Fix format errors
  • ARROW-4404 - [CI] AppVeyor toolchain build does not build anything
  • ARROW-4407 - [C++] Cache compiler for CMake external projects
  • ARROW-4410 - [C++] Fix edge cases in InvertKernel
  • ARROW-4413 - [Python] Fix pa.hdfs.connect() on Python 2
  • ARROW-4414 - [C++] Stop using cmake COMMAND_EXPAND_LISTS because it breaks package builds for older distros
  • ARROW-4417 - [C++] Fix doxygen build
  • ARROW-4420 - [INTEGRATION] Make spark integration test pass and test against spark's master branch
  • ARROW-4421 - [C++][Flight] Handle large RPC messages in Flight
  • ARROW-4434 - [Python] Allow creating trivial StructArray
  • ARROW-4440 - [C++] Revert recent changes to flatbuffers EP causing flakiness
  • ARROW-4457 - [Python] Allow creating Decimal array from Python ints
  • ARROW-4469 - [CI] Pin conda-forge binutils version to 2.31 for now
  • ARROW-4471 - [C++] Pass AR and RANLIB to all external projects
  • ARROW-4474 - Use signed integers in FlightInfo payload size fields
  • ARROW-4480 - [Python] Drive letter removed when writing parquet file
  • ARROW-4487 - [C++] Appveyor toolchain build does not actually build the project
  • ARROW-4494 - [Java] arrow-jdbc JAR is not uploaded on release
  • ARROW-4496 - [Python] Pin to gfortran<4
  • ARROW-4498 - [Plasma] Fix building Plasma with CUDA enabled
  • ARROW-4500 - [C++] Remove pthread / librt hacks causing linking issues in some Linux environments
  • ARROW-4501 - Fix out-of-bounds read in DoubleCrcHash
  • ARROW-4525 - [Rust][Parquet] Enable conversion of ArrowError to ParquetError
  • ARROW-4527 - [Packaging][Linux] Use LLVM 7
  • ARROW-4532 - [Java] fix bug causing very large varchar value buffers
  • ARROW-4533 - [Python] Document how to run hypothesis tests
  • ARROW-4535 - [C++] Fix MakeBuilder to preserve ListType's field name
  • ARROW-4536 - [GLib] Add data_type argument in garrow_list_array_new
  • ARROW-4538 - [Python] Remove index column from subschema in write_to_dataframe
  • ARROW-4549 - [C++] Can't build benchmark code on CUDA enabled build
  • ARROW-4550 - [JS] Fix AMD pattern
  • ARROW-4559 - [Python] Allow Parquet files with special characters in their names
  • ARROW-4563 - [Python] Validate decimal128() precision input
  • ARROW-4571 - [Format] Tensor.fbs file has multiple root_type declarations
  • ARROW-4573 - [Python] Add Flight unit tests
  • ARROW-4576 - [Python] Fix error during benchmarks
  • ARROW-4577 - [C++] Don't set interface link libs on arrow_shared where there are none
  • ARROW-4581 - [C++] Do not require googletest_ep or gbenchmark_ep for library targets
  • ARROW-4582 - [Python/C++] Acquire the GIL on Py_INCREF
  • ARROW-4584 - [Python] Add built wheel to manylinux1 dockerignore
  • ARROW-4585 - [C++] Add protoc dependency to flight_testing
  • ARROW-4587 - [C++] Fix segfaults around DoPut implementation
  • ARROW-4597 - [C++] Targets for system Google Mock shared library are missing
  • ARROW-4601 - [Python] Add license header to dockerignore
  • ARROW-4606 - [Rust] [DataFusion] FilterRelation created RecordBatch with empty schema
  • ARROW-4608 - [C++] cmake script assumes that double-conversion installs static libs
  • ARROW-4617 - [C++] Support double-conversion<3.1
  • ARROW-4624 - [C++] Fix building benchmarks
  • ARROW-4629 - [Python] Pandas arrow conversion slowed down by imports
  • ARROW-4635 - [Java] allocateNew to use last capacity
  • ARROW-4639 - [CI] Switch off GFLAGS_SHARED for osx
  • ARROW-4641 - [C++][Flight] Suppress strict aliasing warnings from "unsafe" casts in client.cc
  • ARROW-4642 - [R] change f to file in read_parquet_file()
  • ARROW-4653 - [C++] Fix bug in decimal multiply
  • ARROW-4654 - [C++] Explicit flight.cc source dependencies
  • ARROW-4657 - Don't build benchmarks in release verify script
  • ARROW-4658 - [C++] Shared gflags is also a run-time conda requirement
  • ARROW-4659 - [CI] ubuntu/debian nightlies fail because of missing gandiva files
  • ARROW-4660 - [C++] Use set_target_properties for defining GFLAGS_IS_A_DLL
  • ARROW-4664 - [C++] Do not execute expressions inside DCHECK macros in release builds
  • ARROW-4669 - [Java] Add validity checks to slice
  • ARROW-4672 - [CI] Fix clang-7 build entry
  • ARROW-4680 - [CI][Rust] Travis CI builds fail with latest Rust 1.34.0…
  • ARROW-4684 - [Python] CI failures in test_cython.py
  • ARROW-4687 - [Python] Stop Flight server on incoming signals
  • ARROW-4688 - [C++][Parquet] Chunk binary column reads at 2^31 - 1 byte boundaries to avoid splitting chunk inside nested string cell
  • ARROW-4696 - Better CUDA detection in release verification script
  • ARROW-4699 - [C++] remove json chunker's requirement of null terminated buffers
  • ARROW-4704 - [GLib][CI] Ensure killing plasma_store_server
  • ARROW-4710 - [C++][R] New linting script skips files with "cpp" extension
  • ARROW-4712 - [C++][CI] fix build (sum.cc) has warnings in clang
  • ARROW-4721 - [Rust][DataFusion] Propagate schema in filter
  • ARROW-4724 - [C++][CI] Enable Python build and test in MinGW build
  • ARROW-4728 - [JS] Fix Table#assign when passed zero-length RecordBatches
  • ARROW-4737 - run C# tests in CI
  • ARROW-4744 - [C++][CI] Change mingw builds back to debug. Clean up some version warnings
  • ARROW-4750 - [C++] RapidJSON triggers Wclass-memaccess on GCC 8+
  • ARROW-4760 - [C++] protobuf 3.7 defines EXPECT_OK that clashes with Arrow's macro
  • ARROW-4766 - [C++] Fix empty array cast segfault
  • ARROW-4767 - [C#] ArrowStreamReader crashes while reading the end of a stream
  • ARROW-4768 - [C++][CI] Don't run flaky tests in MinGW build
  • ARROW-4774 - [C++] Fix FileWriter::WriteTable segfault
  • ARROW-4775 - [Site] Site navbar cannot be expanded
  • ARROW-4783 - [C++][CI] Disable arrow thread-pool test on mingw to avoid appveyor timeouts
  • ARROW-4793 - [Ruby] Suppress unused variable warning
  • ARROW-4796 - [Flight/Python] Keep underlying Python object alive in FlightServerBase.do_get
  • ARROW-4802 - [Python] Follow symlinks when deriving Hadoop classpath for HDFS
  • ARROW-4807 - [Rust] Fix csv_writer benchmark
  • ARROW-4811 - [C++] Fix misbehaving CMake dependency on flight_grpc_gen
  • ARROW-4813 - [Ruby] Add tests for == and !=
  • ARROW-4820 - [Python] Derived hadoop class path is not correct
  • ARROW-4822 - [C++/Python] Check for None on calls to equals
  • ARROW-4828 - [Python] manylinux1 docker-compose context should be python/manylinux1
  • ARROW-4850 - [CI] Ensure integration_test.py returns non-zero on failures
  • ARROW-4853 - [Rust] Array slice doesn't work on ListArray and StructArray
  • ARROW-4857 - [C++/Python/CI] docker-compose in manylinux1 crossbow jobs too old
  • ARROW-4866 - [C++] Fix zstd_ep build for Debug, static CRT builds. Add separate CMake variable for propagating compiler toolchain to ExternalProjects
  • ARROW-4867 - [Python] Respect ordering of columns argument passed to Table.from_pandas
  • ARROW-4869 - [C++] Fix gmock usage in compute/kernels/util-internal-test.cc
  • ARROW-4870 - [Ruby] Fix msys2_mingw_dependencies
  • ARROW-4871 - [Java/Flight] Handle large Flight messages
  • ARROW-4872 - [Python] Keep backward compatibility for ParquetDatasetPiece
  • ARROW-4879 - [C++] cmake can't use conda's flatbuffers
  • ARROW-4881 - [C++] remove references to ARROW_BUILD_TOOLCHAIN
  • ARROW-4900 - [C++] polyfill __cpuidex on mingw-w64
  • ARROW-4903 - [C++] Fix static/shared-only builds
  • ARROW-4906 - [Format] Document that the SparseMatrixIndexCSR format is sorted
  • ARROW-4918 - [C++] Add cmake-format to pre-commit
  • ARROW-4928 - [Python] Fix Hypothesis test failures
  • ARROW-4931 - [C++] CMake fails on gRPC ExternalProject
  • ARROW-4938 - [GLib] Undefined symbols error occurred when GIR file is being generated.
  • ARROW-4942 - [Ruby] Remove needless omits in tests
  • ARROW-4948 - [JS] Nightly test failure
  • ARROW-4950 - [C++] Fix CMake 3.2 build
  • ARROW-4952 - [C++] Floating-point comparisons should consider NaNs unequal
  • ARROW-4953 - [Ruby] Not loading libarrow-glib
  • ARROW-4954 - [Python] Fix test failure with Flight enabled
  • ARROW-4958 - [C++] Parquet benchmarks depend on its static test libs
  • ARROW-4961 - [C++] Add documentation note that GTest_SOURCE=BUNDLED is currently required on Windows
  • ARROW-4962 - [C++] Warning level to CHECKIN can't compile on modern GCC
  • ARROW-4976 - [JS] Invalidate RecordBatchReader node/dom streams on reset()
  • ARROW-4982 - [GLib][CI] Run tests on AppVeyor
  • ARROW-4984 - Check if Flight gRPC server starts properly
  • ARROW-4986 - [CI] Travis fails to install llvm@7
  • ARROW-4989 - [C++] Find re2 on Ubuntu if asked to
  • ARROW-4991 - [CI] Bump travis node version to 11.12
  • ARROW-4997 - [C#] ArrowStreamReader doesn't consume whole stream and doesn't implement sync read.
  • ARROW-5009 - [C++] Remove using std::.* where I could find them
  • ARROW-5010 - [Release] Fix source release docker
  • ARROW-5012 - [C++] Install testing headers
  • ARROW-5023 - [Release] Fix default value syntax in 02-source.sh
  • ARROW-5024 - [Release] Fix missing variable with --arrow-version
  • ARROW-5025 - [Python][Packaging] Fix gandiva.dll detection
  • ARROW-5026 - [Python][Packaging] Fix gandiva.dll detection on non Windows
  • ARROW-5029 - [C++] Fix compilation warnings in release mode
  • ARROW-5031 - [Dev] Run CUDA Python tests in release verification script
  • ARROW-5042 - [Release] Use the correct dependency source in verification script
  • ARROW-5043 - [Release][Ruby] Fix dependency error in verification script
  • ARROW-5044 - [Release][Rust] Use stable toolchain for format check in verification script
  • ARROW-5046 - [Release][C++] Exclude fragile Plasma test from verification script
  • ARROW-5047 - [Release] Always set up parquet-testing in verification script
  • ARROW-5048 - [Release][Rust] Set up arrow-testing in verification script
  • ARROW-5050 - [C++] cares_ep should build before grpc_ep
  • ARROW-5087 - [Debian] APT repository no longer contains libarrow-dev
  • ARROW-5658 - [JAVA] Provide ability to resync VectorSchemaRoot if types change
  • PARQUET-1482 - [C++] Add branch to TypedRecordReader::ReadNewPage for …
  • PARQUET-1494 - [C++] Recognize statistics built with UNSIGNED sort order by parquet-mr 1.10.0 onwards
  • PARQUET-1532 - [C++] Fix build error with MinGW

New Features and Improvements

  • ARROW-47 - [C++] Preliminary arrow::Scalar object model
  • ARROW-331 - [Doc] Add statement about Python 2.7 compatibility
  • ARROW-549 - [C++] Add arrow::Concatenate function to combine multiple arrays into a single Array
  • ARROW-572 - [C++] Apply visitor pattern in IPC metadata
  • ARROW-585 - [C++] Experimental public API for user-defined extension types and arrays
  • ARROW-694 - [C++] Initial parser interface for reading JSON into RecordBatches
  • ARROW-1425 - [Python][Documentation] Examples of converting Timestamps to/from pandas via arrow
  • ARROW-1572 - [C++] Implement "value counts" kernels for tabulating value frequencies
  • ARROW-1639 - [Python] Serialize RangeIndex as metadata via Table.from_pandas instead of converting to a column of integers
  • ARROW-1642 - [GLib] Build GLib using Meson in Appveyor
  • ARROW-1807 - [JAVA] Reduce Heap Usage (Phase 3): consolidate buffers
  • ARROW-1896 - [C++] Do not allocate memory inside CastKernel. Clean up template instantiation to not generate dead identity cast code
  • ARROW-2015 - [Java] Replace Joda time with Java 8 time
  • ARROW-2022 - [Format] Add metadata to message
  • ARROW-2112 - [C++] Enable cpplint to be run on Windows
  • ARROW-2243 - [C++] Enable IPO/LTO
  • ARROW-2409 - [Rust] Deny warnings in CI.
  • ARROW-2460 - [Rust] Schema and DataType::Struct should use Vec<Rc<Field>>
  • ARROW-2487 - [C++] Provide a variant of AppendValues that takes bytemaps for the nullability
  • ARROW-2523 - [Rust] Implement CAST operations for arrays
  • ARROW-2620 - [Rust] Integrate memory pool abstraction with rest of codebase
  • ARROW-2627 - [Python] Add option to pass memory_map argument to ParquetDataset
  • ARROW-2904 - [C++] Use FirstTimeBitmapWriter instead of SetBit functions in builder.h/cc
  • ARROW-3066 - [Wiki] Add "How to contribute" to developer wiki
  • ARROW-3084 - [Python] Do we need to build both unicode variants of pyarrow wheels?
  • ARROW-3107 - [C++] arrow::PrettyPrint for Column instances
  • ARROW-3121 - [C++] Mean aggregate kernel
  • ARROW-3123 - [C++] Implement Count aggregate kernel
  • ARROW-3135 - [C++] Add helper functions for validity bitmap propagation in kernel context
  • ARROW-3149 - [C++] Use gRPC (when it exists) from conda-forge for CI builds
  • ARROW-3162 - [Python][Flight] Enable implementing Flight servers in Python
  • ARROW-3162 - Flight Python bindings
  • ARROW-3239 - [C++] Implement simple random array generation
  • ARROW-3255 - [C++/Python] Migrate Travis CI jobs off Xcode 6.4
  • ARROW-3289 - [C++] Implement Flight DoPut
  • ARROW-3292 - [C++] Test Flight RPC in Travis CI
  • ARROW-3295 - [Packaging] Package gRPC libraries in conda-forge for use in builds, packaging
  • ARROW-3297 - [Python] Python bindings for Flight C++ client
  • ARROW-3311 - [R] Functions for deserializing IPC components from arrow::Buffer or from IO interface
  • ARROW-3328 - [Flight] Allow for optional unique flight identifier to be sent with FlightGetInfo
  • ARROW-3361 - [R] Also run cpplint on Rcpp source files
  • ARROW-3364 - [Docs] Add docker-compose integration documentation
  • ARROW-3367 - [INTEGRATION] Port Spark integration test to the docker-compose setup
  • ARROW-3422 - [C++] Uniformly add ExternalProject builds to the "toolchain" target. Fix gRPC EP build on Linux
  • ARROW-3434 - [Packaging] Add Apache ORC C++ library to conda-forge
  • ARROW-3435 - [C++] Add option to use dynamic linking with re2
  • ARROW-3511 - [Gandiva] Link filter and project operations
  • ARROW-3532 - [Python] Emit warning when looking up duplicate struct or schema fields
  • ARROW-3550 - [C++] use kUnknownNullCount for the default null_count argument
  • ARROW-3554 - [C++] Reverse traits for C++
  • ARROW-3594 - [Packaging] Build "cares" library in conda-forge
  • ARROW-3595 - [Packaging] Build boringssl in conda-forge
  • ARROW-3596 - [Packaging] Build gRPC in conda-forge
  • ARROW-3619 - [R] Expose global thread pool options
  • ARROW-3631 - [C#] Add Appveyor configuration
  • ARROW-3653 - [C++][Python] Support data copying between different GPU devices
  • ARROW-3735 - [Python] Add test for calling cast() with None
  • ARROW-3761 - [R] Bindings for CompressedInputStream, CompressedOutputStream
  • ARROW-3763 - [C++] Write Parquet ByteArray / FixedLenByteArray reader batches directly into arrow::BinaryBuilder
  • ARROW-3769 - [C++] Add support for reading non-dictionary encoded binary Parquet columns directly as DictionaryArray
  • ARROW-3770 - [C++] Validate schema for each table written with parquet::arrow::FileWriter
  • ARROW-3816 - [R] nrow.RecordBatch method
  • ARROW-3824 - [R] Add basic build and test documentation
  • ARROW-3838 - [Rust] CSV Writer
  • ARROW-3846 - [Gandiva][C++] Build Gandiva C++ libraries and get unit tests passing on Windows
  • ARROW-3882 - [Rust] Cast Kernel for most types
  • ARROW-3903 - [Python] Random array generator for Arrow conversion and Parquet testing
  • ARROW-3926 - [Python] Add Gandiva bindings to Python manylinux1 wheels
  • ARROW-3951 - [Go] implement a CSV writer
  • ARROW-3954 - [Rust] Add Slice to Array and ArrayData
  • ARROW-3965 - [Java] JDBC-To-Arrow Configuration
  • ARROW-3966 - [Java] JDBC Column Metadata in Arrow Field Metadata
  • ARROW-3972 - [C++] Migrate to LLVM 7. Add option to disable using ld.gold
  • ARROW-3981 - [C++] Rename json.h
  • ARROW-3985 - [C++] Let ccache preserve comments
  • ARROW-4012 - [Website] Add documentation how to install Apache Arrow on MSYS2
  • ARROW-4014 - [C++] Fix "LIBCMT" warnings on MSVC
  • ARROW-4023 - [Gandiva] Address long CI times in macOS builds
  • ARROW-4024 - [Python] Raise minimal Cython version to 0.29
  • ARROW-4031 - [C++] Refactor bitmap building
  • ARROW-4040 - [Rust] Add array_ops method for filtering an array
  • ARROW-4056 - [C++] Unpin boost-cpp in conda_env_cpp.yml
  • ARROW-4061 - [Rust][Parquet] Implement spaced version for non-diction…
  • ARROW-4068 - [Gandiva] Support building with Xcode 6.4
  • ARROW-4071 - [Rust] Add rustfmt as a pre-commit hook
  • ARROW-4072 - [Rust] Set default value for PARQUET_TEST_DATA
  • ARROW-4092 - [Rust] Implement common Reader / DataSource trait for CSV and Parquet
  • ARROW-4094 - [Python] Store RangeIndex in Parquet files as metadata rather than a physical data column
  • ARROW-4110 - [C++] Do not generate distinct cast kernels when input and output type are the same
  • ARROW-4123 - [C++] Enable linting tools to be run on Windows
  • ARROW-4124 - [C++] Draft Aggregate and Sum kernels
  • ARROW-4142 - [Java] JDBC Array -> Arrow ListVector
  • ARROW-4165 - [C++] Port cpp/apidoc/Windows.md and other files to Sphinx / rst
  • ARROW-4180 - [Java] Make CI tests use logback.xml
  • ARROW-4196 - [Rust] Add explicit SIMD vectorization for arithmetic ops in "array_ops"
  • ARROW-4198 - [Gandiva] Added support to cast timestamp
  • ARROW-4204 - [Gandiva] add support for decimal subtract
  • ARROW-4205 - [Gandiva] Support for decimal multiply
  • ARROW-4206 - [Gandiva] support decimal divide and mod
  • ARROW-4212 - [C++][Python] CudaBuffer view of arbitrary device memory object
  • ARROW-4230 - [C++] Fix Flight builds with gRPC/Protobuf/c-ares
  • ARROW-4232 - [C++] Follow conda-forge compiler ABI migration
  • ARROW-4234 - [C++] Improve memory bandwidth test
  • ARROW-4235 - [GLib] Use "column_builder" in GArrowRecordBatchBuilder
  • ARROW-4236 - [java] Distinct plasma client create exceptions
  • ARROW-4245 - [Rust] Add Rustdoc header to source files
  • ARROW-4247 - [Packaging] Update verify script for 0.12.0
  • ARROW-4251 - [C++][Release] Add option to set ARROW_BOOST_VENDORED environment variable in verify-release-candidate.sh
  • ARROW-4262 - [Website] Preview to Spark with Arrow and R improvements
  • ARROW-4263 - [Rust] Donate DataFusion
  • ARROW-4265 - [C++] Automatic conversion between Table and std::vector<std::tuple<..>>
  • ARROW-4268 - [C++] Native C type TypeTraits
  • ARROW-4271 - [Rust] Move Parquet specific info to Parquet Readme
  • ARROW-4273 - [Release] Fix verification script to use cf201901 conda-forge label
  • ARROW-4277 - [C++] Add gmock to the toolchain
  • ARROW-4281 - [CI] Use Ubuntu Xenial VMs on Travis-CI
  • ARROW-4285 - [Python] Use proper builder interface for serialization
  • ARROW-4287 - [C++] Ensure minimal bison version on OSX for Thrift
  • ARROW-4289 - [C++] Forward AR and RANLIB to thirdparty builds
  • ARROW-4290 - [C++/Gandiva] Support detecting correct LLVM version in Homebrew
  • ARROW-4291 - [Dev] Support selecting features in release verification scripts
  • ARROW-4294 - [C++][Plasma] Add support for evicting Plasma objects to external store
  • ARROW-4297 - [C++] Fix build error with MinGW-w64 32-bit
  • ARROW-4298 - [Java] Add javax.annotation-api dependency for JDK >= 9
  • ARROW-4299 - [Ruby] Depend on the same version as Red Arrow
  • ARROW-4300 - [C++] Restore apache-arrow Homebrew recipe and define process for maintaining and updating for releases
  • ARROW-4303 - [Gandiva/Python] Build LLVM with RTTI in manylinux1 container
  • ARROW-4305 - [Rust] Fix parquet version number in README
  • ARROW-4307 - [C++] Fix Doxygen warnings
  • ARROW-4310 - [Website] Update install document for 0.12.0
  • ARROW-4313 - Define general benchmark database schema
  • ARROW-4315 - [Website] Add Go and Rust to list of supported languages
  • ARROW-4318 - [C++] Add Tensor::CountNonZero
  • ARROW-4321 - [CI] Setup conda-forge channel globally in docker containers
  • ARROW-4330 - [C++] More robust discovery of pthreads
  • ARROW-4331 - [C++] Extend Scalar Datum to support more types
  • ARROW-4332 - [Website] Improve documentation for publishing site
  • ARROW-4334 - [CI] Setup conda-forge channel globally in travis builds
  • ARROW-4335 - [C++] Better document sparse tensor support
  • ARROW-4336 - [C++] Change default build type to RELEASE
  • ARROW-4339 - [C++][Python] Developer documentation overhaul for 0.13 release
  • ARROW-4340 - [C++][CI] Build IWYU for LLVM 7 in iwyu docker-compose job
  • ARROW-4341 - [C++] Refactor Primitive builders and BooleanBuilder to use TypedBufferBuilder<T>
  • ARROW-4344 - [Java] Further cleanup mvn output, upgrade rat plugin
  • ARROW-4345 - [C++] Add Apache 2.0 license file to the Parquet-testing repository
  • ARROW-4346 - [C++] Fix class-memaccess warning on gcc 8.x
  • ARROW-4352 - [C++] Add support for system Google Test
  • ARROW-4353 - [CI] Add MinGW builds
  • ARROW-4358 - [CI] Restore support for trusty in CI
  • ARROW-4361 - [Website] Update committers list
  • ARROW-4362 - [Java] Test OpenJDK 11 in CI
  • ARROW-4363 - [CI][C++] Add CMake format checks
  • ARROW-4372 - [C++] Embed precompiled bitcode in the gandiva library
  • ARROW-4373 - [Packaging] Travis fails to deploy conda packages on OSX
  • ARROW-4375 - [CI] Sphinx dependencies were removed from docs conda environment
  • ARROW-4376 - [Rust] Implement from_buf_reader for csv::Reader
  • ARROW-4377 - [Rust] Implement std::fmt::Debug for PrimitiveArrays
  • ARROW-4379 - [Python] Register serializers for collections.Counter and collections.deque.
  • ARROW-4383 - [C++] Use the CMake's standard find features
  • ARROW-4386 - [Rust] Temporal array support
  • ARROW-4388 - [Go] add DimNames() method to tensor Interface
  • ARROW-4393 - [Rust] coding style: apply 90 characters per line limit
  • ARROW-4396 - [JS] Update Typedoc for TypeScript 3.2
  • ARROW-4397 - [C++] Add dim_names in Tensor and SparseTensor
  • ARROW-4399 - [C++] Do not use extern template class with NumericArray<T> and NumericTensor<T>
  • ARROW-4401 - [Python] Alpine dockerfile fails to build because pandas requires numpy as build dependency
  • ARROW-4406 - [Python] Exclude HDFS directories in S3 from ParquetManifest
  • ARROW-4408 - [CPP/Doc] Remove outdated Parquet documentation
  • ARROW-4422 - [Plasma] Enforce memory limit in plasma, rather than relying on dlmalloc_set_footprint_limit
  • ARROW-4423 - [C++] Upgrade vendored gmock/gtest to 1.8.1
  • ARROW-4424 - [Python] Install tensorflow and keras-preprocessing in manylinux1 container
  • ARROW-4425 - Add link to 'Contributing' page in the top-level Arrow README
  • ARROW-4430 - [C++] Fix untested TypedByteBuffer<T>::Append method
  • ARROW-4431 - [C++] Fixes for gRPC vendored builds
  • ARROW-4435 - Minor fixups to csharp .sln and .csproj file
  • ARROW-4436 - [Documentation] Update building.rst to reflect pyarrow req
  • ARROW-4442 - [JS] Add explicit type annotation to Chunked typeId getter
  • ARROW-4444 - [Testing] Add DataFusion test files to arrow-testing repo
  • ARROW-4445 - [C++][Gandiva] Run Gandiva-LLVM tests in Appveyor
  • ARROW-4446 - [C++][Python] Run Gandiva C++ unit tests in Appveyor, get build and tests working in Python
  • ARROW-4448 - [Java][Flight] Disable flaky TestBackPressure
  • ARROW-4449 - [Rust] Convert File to T: Read + Seek for schema inference
  • ARROW-4454 - [C++] fix unused parameter warnings
  • ARROW-4455 - [Plasma] Suppress class-memaccess warnings
  • ARROW-4459 - [Testing] Add arrow-testing repo as submodule
  • ARROW-4460 - [Website] DataFusion Blog Post
  • ARROW-4461 - [C++] Expose bit map operations that work with raw pointers
  • ARROW-4462 - [C++] Upgrade LZ4 v1.7.5 to v1.8.3 to compile with VS2017
  • ARROW-4464 - [Rust][DataFusion] Add support for LIMIT
  • ARROW-4466 - [Rust][DataFusion] Add support for Parquet data source
  • ARROW-4468 - [Rust] Implement BitAnd/BitOr for &Buffer (with SIMD) (#3571)
  • ARROW-4472 - [Website][Python] Blog post about string memory use work in Arrow 0.12
  • ARROW-4475 - [Python] Fix recursive serialization of self-containing objects
  • ARROW-4476 - [Rust][DataFusion] Update README to cover DataFusion and new testing git submodule
  • ARROW-4481 - [Website] Remove generated specification docs from site after docs migration
  • ARROW-4483 - [Website] Add myself to contributors.yaml to fix broken link in blog post
  • ARROW-4485 - [CI] Determine maintenance approach to pinned conda-forge binutils package
  • ARROW-4486 - [Python][CUDA] Add base argument to foreign_buffer
  • ARROW-4488 - [Rust][u8] > for Buffer does not ensure correct padding
  • ARROW-4489 - [Rust] PrimitiveArray.value_slice performs bounds checking when it should not
  • ARROW-4490 - [Rust] Add explicit SIMD vectorization for boolean ops in "array_ops"
  • ARROW-4491 - [Python] Use StringConverter and stringstream instead of std::stoi and std::to_string
  • ARROW-4499 - [CI] Unpin flake8 in lint script, fix warnings in dev/
  • ARROW-4502 - [C#] Add support for zero-copy reads
  • ARROW-4506 - [Ruby] Add Arrow::RecordBatch#raw_records
  • ARROW-4513 - [Rust] Implement BitAnd/BitOr for &Bitmap
  • ARROW-4517 - [JS] remove version number as it is not used
  • ARROW-4518 - [JS] add jsdelivr to package.json
  • ARROW-4528 - [C++] Update lint docker container to LLVM-7
  • ARROW-4529 - [C++] Add test for BitUtil::RoundDown
  • ARROW-4531 - [C++] Support slices for SumKernel
  • ARROW-4537 - [CI] Suppress shell warning on travis-ci
  • ARROW-4539 - [Java] Fix child vector count for lists. (#3625)
  • ARROW-4540 - [Rust] Basic JSON reader
  • ARROW-4543 - [C#] Update Flat Buffers code to latest version
  • ARROW-4546 - Update LICENSE.txt with parquet-cpp licenses
  • ARROW-4547 - [Python][Documentation] Update python/development.rst with instructions for CUDA-enabled builds
  • ARROW-4556 - [Rust] Preserve JSON field order when inferring schema
  • ARROW-4558 - [C++][Flight] Implement gRPC customizations without UB
  • ARROW-4560 - [R] array() needs to take single input, not ...
  • ARROW-4562 - [C++] Avoid copies when serializing Flight data
  • ARROW-4564 - [C++] IWYU docker image silently fails
  • ARROW-4565 - [R] Fix decimal record batches with no null values
  • ARROW-4568 - [C++] Add version macros to headers
  • ARROW-4572 - [C++] Remove memory zeroing from PrimitiveAllocatingUnaryKernel
  • ARROW-4583 - [Plasma] Fix some small bugs reported by code scan tool
  • ARROW-4586 - [Rust] Remove arrow/mod.rs as it is not needed
  • ARROW-4589 - [Rust] Projection push down query optimizer rule
  • ARROW-4590 - [Rust] Add explicit SIMD vectorization for comparison ops in "array_ops"
  • ARROW-4592 - [GLib] Stop configure immediately when GLib isn't available
  • ARROW-4593 - [Ruby][out_of_range] returns nil
  • ARROW-4594 - [Ruby] returns Arrow::Struct instead of Arrow::Array
  • ARROW-4595 - [Rust] Implement Table API (a.k.a DataFrame)
  • ARROW-4598 - [CI] Remove needless LLVM_DIR for macOS
  • ARROW-4599 - [C++] Add support for system GFlags
  • ARROW-4602 - [Rust][DataFusion] Integrate query optimizer with ExecutionContext
  • ARROW-4603 - [Rust] [DataFusion] Execution context should allow in-memory data sources to be registered
  • ARROW-4604 - [Rust] [DataFusion] Add benchmarks for SQL query execution
  • ARROW-4605 - [Rust] Move filter and limit code from DataFusion into compute module
  • ARROW-4609 - [C++] Use google benchmark from toolchain
  • ARROW-4610 - [Plasma] Avoid Crash in Plasma Java Client
  • ARROW-4611 - [C++] Rework CMake logic
  • ARROW-4612 - [Python] Use cython from PyPI for windows wheels build
  • ARROW-4613 - [C++] Set CMAKE_INSTALL_LIBDIR in gtest thirdparty build
  • ARROW-4614 - [C++/CI] Activate flight build in ci/docker_build_cpp.sh
  • ARROW-4615 - [C++] Add checked_pointer_cast
  • ARROW-4616 - [C++] Log message in BuildUtils as STATUS
  • ARROW-4618 - [Docker] Makefile to build dependent docker images
  • ARROW-4619 - [R] Fix the autobrew script
  • ARROW-4620 - [C#] Add unit tests for "Types" in arrow/csharp
  • ARROW-4623 - [R] update Rcpp version
  • ARROW-4628 - [Rust][DataFusion] Implement type coercion query optimizer rule
  • ARROW-4632 - [Ruby] Add BigDecimal#to_arrow
  • ARROW-4634 - [Rust][Parquet] Reorganize test_common
  • ARROW-4637 - [Python] Conditionally import pandas symbols if they are used. Do not require pandas as a test dependency
  • ARROW-4638 - [R] install instructions using brew
  • ARROW-4640 - [Python] Add docker-compose configuration to build and test the project without pandas installed
  • ARROW-4643 - [C++] Force compiler diagnostic colors
  • ARROW-4644 - [C++/Docker] Build Gandiva in the docker containers
  • ARROW-4645 - [C++/Packaging] Ship Gandiva with OSX and Windows wheels
  • ARROW-4646 - [C++/Packaging] Ship gandiva with the conda-forge packages
  • ARROW-4655 - [Packaging] Parallelize binary upload
  • ARROW-4662 - [Python] Add support of type_codes in UnionType
  • ARROW-4667 - [C++] Suppress unused function warnings with MinGW
  • ARROW-4670 - [Rust] array_ops::sum performance optimizations
  • ARROW-4671 - [C++] MakeBuilder doesn't support Type::DICTIONARY
  • ARROW-4673 - [C++] Implement Scalar::Equals and Datum::Equals
  • ARROW-4676 - [C++] Add support for debug build with MinGW
  • ARROW-4678 - [Rust] Minimize unstable feature usage
  • ARROW-4679 - [Rust] Implement in-memory data source for DataFusion
  • ARROW-4681 - [Rust][DataFusion] Partition aware data sources
  • ARROW-4686 - [Dev] Only accept 'y' or 'n' in merge_arrow_pr.py prompts
  • ARROW-4689 - [Go] Add support for wasm
  • ARROW-4690 - Building TensorFlow compatible wheels for Arrow
  • ARROW-4692 - [Flight] Explain sidecar in a bit more detail
  • ARROW-4693 - [CI] Build boost with multiprecision
  • ARROW-4697 - [C++] Add URI parsing facility
  • ARROW-4703 - [C++] Upgrade dependency versions
  • ARROW-4705 - [Rust] Improve error handling in csv reader
  • ARROW-4707 - [C++] moving BitsetStack to BitUtil::
  • ARROW-4718 - [C#] Add ArrowStreamReader/Writer ctor with bool leaveOpen
  • ARROW-4727 - [Rust] Add equality check for schemas
  • ARROW-4730 - [C++] Add docker-compose entry for testing Fedora build with system packages
  • ARROW-4731 - [C++] Add docker-compose entry for testing Ubuntu Xenial build with system packages
  • ARROW-4732 - [C++] Add docker-compose entry for testing Debian Testing build with system packages
  • ARROW-4733 - [C++] Add CI entry that builds without the conda-forge toolchain but with system packages
  • ARROW-4734 - [Go] Add option to write a header for CSV writer
  • ARROW-4735 - [Go] Optimize CSV writer CPU/Mem performances
  • ARROW-4739 - [Rust] LogicalPlan can now be passed to threads
  • ARROW-4740 - [Java] Upgrade to JUnit 5.
  • ARROW-4743 - [Java] Add javadoc missing in classes and methods in java…
  • ARROW-4745 - [C++][Documentation] Document notes from replicating Static_Crt_Build on windows
  • ARROW-4749 - [Rust] Return Result for RecordBatch::new()
  • ARROW-4751 - [C++] Add pkg-config to conda_env_cpp.yml now that it's available on Windows
  • ARROW-4754 - [Java] Randomize port and retry binding server when bind fails
  • ARROW-4756 - Update readme for triggering docker builds
  • ARROW-4758 - [C++][Flight] Fix intermittent build failure
  • ARROW-4769 - [Rust] Improve array limit fn where max_records >= len
  • ARROW-4772 - [C++] new ORC adapter interface for stripe and row iteration
  • ARROW-4776 - [C++] Add DictionaryBuilder constructor which takes a dictionary array
  • ARROW-4777 - [C++/Python] manylinux1: Update lz4 to 1.8.3
  • ARROW-4778 - [C++/Python] manylinux1: Update Thrift to 0.12.0
  • ARROW-4782 - [C++] Prototype array and scalar expression types to help with building a deferred compute graph
  • ARROW-4786 - [C++/Python] Support better parallelisation in manylinux1 base build
  • ARROW-4789 - [C++] Deprecate and later remove arrow::io::ReadableFileInterface
  • ARROW-4790 - [Python/Packaging] Update manylinux docker image in crossbow task
  • ARROW-4791 - [Rust] Remove unused dependencies
  • ARROW-4794 - [Python] Make pandas an optional test dependency
  • ARROW-4797 - [Plasma] Allow client to check store capacity and avoid server crash
  • ARROW-4801 - [GLib] Suppress Meson warnings
  • ARROW-4808 - [Java][Vector] More util methods to set decimal vector.
  • ARROW-4812 - [Rust] [DataFusion] Table.scan() should return one iterator per partition
  • ARROW-4817 - [Rust] [DataFusion] Small re-org of modules
  • ARROW-4818 - [Rust] [DataFusion] Parquet data source does not support null values
  • ARROW-4826 - [Go] export Flush method for CSV writer
  • ARROW-4831 - [C++] CMAKE_AR is not passed to ZSTD thirdparty dependency
  • ARROW-4833 - [Release] Document how to update the brew formula in the release management guide
  • ARROW-4834 - [R] Feature flag when building parquet
  • ARROW-4835 - [GLib] Add boolean operations
  • ARROW-4837 - [C++] Support c++filt on a custom path in the run-test.sh script
  • ARROW-4839 - [C#] Add NuGet package metadata and instructions.
  • ARROW-4843 - [Rust] [DataFusion] Parquet data source should support DATE
  • ARROW-4846 - [Java] Upgrade to jackson 2.9.8
  • ARROW-4849 - [C++] Add docker-compose entry for testing Ubuntu Bionic build with system packages
  • ARROW-4854 - [Rust] Use zero-copy slice for limit kernel
  • ARROW-4855 - [Packaging] Generate default package version based on cpp tags in crossbow.py
  • ARROW-4858 - [Flight/Python] enable FlightDataStream to be implemented in Python
  • ARROW-4859 - [GLib] Add garrow_numeric_array_mean()
  • ARROW-4862 - [C++] Fix gcc warnings in CHECKIN
  • ARROW-4862 - [GLib] Add GArrowCastOptions::allow-invalid-utf8 property
  • ARROW-4865 - [Rust] Support list casts
  • ARROW-4873 - [C++] Clarify documentation about how to use external ARROW_PACKAGE_PREFIX while also using CONDA dependency resolution
  • ARROW-4878 - [C++] Append \Library to CONDA_PREFIX when using ARROW_DEPENDENCY_SOURCE=CONDA
  • ARROW-4882 - [GLib] Add sum functions
  • ARROW-4887 - [GLib] Add garrow_array_count()
  • ARROW-4889 - [C++] Add STATUS messages for Protobuf in CMake
  • ARROW-4891 - [C++] Add zlib headers to include directories
  • ARROW-4892 - [Rust][DataFusion] Move SQL parser and planner into SQL module
  • ARROW-4893 - [C++] conda packages should use inside of conda-build
  • ARROW-4894 - [Rust][DataFusion] Remove all uses of panic! from aggregate.rs
  • ARROW-4895 - [Rust][DataFusion] Move error.rs to root of crate
  • ARROW-4896 - [Rust][DataFusion] Remove all uses of panic! from DataFusion tests
  • ARROW-4897 - [Rust][DataFusion] Improve rustdocs
  • ARROW-4898 - [C++] Old versions of FindProtobuf.cmake use ALL-CAPS for variables
  • ARROW-4899 - [Rust][DataFusion] Remove panic from expression.rs
  • ARROW-4901 - [Go] add AppVeyor CI
  • ARROW-4905 - [C++][Plasma] Remove dlmalloc symbols from client library
  • ARROW-4907 - [CI] Add docker container to inspect docker context
  • ARROW-4908 - [Rust][DataFusion] Add support for date/time parquet types encoded as INT32/INT64
  • ARROW-4909 - [CI] Use hadolint to lint Dockerfiles
  • ARROW-4910 - [Rust][DataFusion] Remove all uses of unimplemented!
  • ARROW-4915 - [GLib][C++] Add arrow::NullBuilder support for GLib
  • ARROW-4922 - [Packaging] Use system libraries for .deb and .rpm
  • ARROW-4924 - [Ruby] Add Decimal128#to_s(scale=nil)
  • ARROW-4925 - [Rust] [DataFusion] Remove duplicate implementations of collect_expr
  • ARROW-4926 - [Rust][DataFusion] Update README for 0.13.0
  • ARROW-4929 - [GLib] Add garrow_array_count_values()
  • ARROW-4932 - [GLib] Use G_DECLARE_DERIVABLE_TYPE macro
  • ARROW-4933 - [R] Autodetect Parquet support using pkg-config
  • ARROW-4937 - [R] Clean pkg-config related logic
  • ARROW-4939 - [Python] Add wrapper for "sum" kernel
  • ARROW-4940 - [Rust] Enable warnings for missing docs, add docs in datafusion
  • ARROW-4944 - [C++] Raise minimal required thrift-cpp to 0.11 in conda environment
  • ARROW-4946 - [C++] Support detection of flatbuffers without FlatbuffersConfig.cmake
  • ARROW-4947 - [Flight/C++] Remove redundant schema parameter to Flight client DoGet
  • ARROW-4951 - [C++] Turn off cpp benchmarks in cpp docker images
  • ARROW-4955 - [GLib] Add garrow_file_is_closed()
  • ARROW-4964 - [Ruby] Add closed check if available on auto close
  • ARROW-4969 - [C++] Set RPATH in correct order for test executables on OSX
  • ARROW-4977 - [Ruby] Add support for building on Windows
  • ARROW-4978 - [Ruby] Fix wrong internal variable name for table data
  • ARROW-4979 - [GLib] Add missing lock to garrow::GIOInputStream
  • ARROW-4980 - [GLib] Use GInputStream as the parent of GArrowInputStream
  • ARROW-4981 - [Ruby] Add support for CSV data encoding conversion
  • ARROW-4983 - [Plasma] Unmap memory upon destruction of the PlasmaClient
  • ARROW-4994 - [Website] Update details for ptgoetz
  • ARROW-4995 - [R] Support for winbuilder for CRAN checks
  • ARROW-4996 - [Plasma] Enable uninstalling of signal handler and fix log_dir
  • ARROW-5003 - [R] remove dependency on withr
  • ARROW-5006 - [R] parquet.cpp does not include enough Rcpp
  • ARROW-5011 - [Release] Add support in source release script for custom git hash
  • ARROW-5013 - [Rust][DataFusion] Refactor runtime expression support
  • ARROW-5014 - [Java] Fix typos in Flight module
  • ARROW-5018 - [Release] Include JavaScript implementation
  • ARROW-5032 - [C++] Install headers in vendored/datetime directory
  • ARROW-5041 - [C++] add GTest_SOURCE=BUNDLED to verify-release-candidate.bat
  • ARROW-5075 - [Release] Add 0.13.0 release note
  • ARROW-5084 - [Website] Blog post / release announcement for 0.13.0
  • PARQUET-1477 - [C++] sync thrift to final crypto spec
  • PARQUET-1508 - [C++] Read ByteArray data directly into arrow::BinaryBuilder and BinaryDictionaryBuilder. Refactor encoders/decoders to use cleaner virtual interfaces
  • PARQUET-1519 - [C++] Hide TypedColumnReader implementation behind virtual interfaces, remove use of "extern template class"
  • PARQUET-1521 - [C++] Use pure virtual interfaces for parquet::TypedColumnWriter, remove use of 'extern template class'
  • PARQUET-1525 - [C++] remove dependency on getopt in parquet tools
kszucs published 0.4.1

Apache Arrow 0.4.1 (2017-06-09)

Bug Fixes

  • ARROW-424 - [C++] Make ReadAt, Write HDFS functions threadsafe
  • ARROW-1039 - Python: pyarrow.Filesystem.read_parquet causing error if nthreads>1
  • ARROW-1050 - [C++] Export arrow::ValidateArray
  • ARROW-1051 - [Python] Opt in to Parquet unit tests to avoid accidental suppression of dynamic linking errors
  • ARROW-1056 - [Python] Ignore pandas index in parquet+hdfs test
  • ARROW-1057 - Fix cmake warning and msvc debug asserts
  • ARROW-1060 - [Python] Add unit tests for reference counts in memoryview interface
  • ARROW-1062 - [GLib] Follow API changes in examples
  • ARROW-1066 - [Python] pandas 0.20.1 deprecation of pd.lib causes a warning on import
  • ARROW-1070 - [C++] Use physical types for Feather date/time types
  • ARROW-1075 - [GLib] Fix build error on macOS
  • ARROW-1082 - [GLib] Add CI on macOS
  • ARROW-1085 - [java] Follow up on template cleanup. Missing method for …
  • ARROW-1086 - include additional pxd files during package build
  • ARROW-1088 - [Python] Only test unicode filenames if system supports them
  • ARROW-1090 - Improve build_ext usability with --bundle-arrow-cpp
  • ARROW-1091 - Decimal scale and precision are flipped
  • ARROW-1092 - More Decimal and scale flipped follow-up
  • ARROW-1094 - [C++] Always truncate buffer read in ReadableFile::Read if actual number of bytes less than request
  • ARROW-1127 - pyarrow 4.1 import failure on Travis

New Features and Improvements

  • ARROW-897 - [GLib] Extract CI configuration for GLib
  • ARROW-986 - [Format] Add brief explanation of dictionary batches in IPC.md
  • ARROW-990 - [JS] Add tslint support for linting TypeScript
  • ARROW-1020 - [Format] Revise language for Timestamp type in Schema.fbs to avoid possible confusion about tz-naive timestamps
  • ARROW-1034 - [PYTHON] Resolve wheel build issues on Windows
  • ARROW-1049 - [java] vector template cleanup
  • ARROW-1063 - [Website] Updates for 0.4.0 release, release posting
  • ARROW-1068 - [Python] Create external repo with appveyor.yml configured for building Python wheel installers
  • ARROW-1069 - Add instructions for publishing maven artifacts
  • ARROW-1078 - [Python] Account for Apache Parquet shared library consolidation
  • ARROW-1080 - C++: Add tutorial about converting to/from row-wise representation
  • ARROW-1084 - Implementations of BufferAllocator should handle Netty's OutOfDirectMemoryError
  • ARROW-1118 - [Website] Site updates for 0.4.1
xhochy published 0.4.0

Apache Arrow 0.4.0 (2017-05-22)

Bug Fixes

  • ARROW-813 - [Python] setup.py sdist must also bundle dependent cmake m…
  • ARROW-824 - Date and Time Vectors should reflect timezone-less semantics
  • ARROW-856 - Also read compiler info from stdout
  • ARROW-909 - Link jemalloc statically if built as external project
  • ARROW-939 - fix division by zero if one of the tensor dimensions is zero
  • ARROW-940 - [JS] Generate multiple artifacts
  • ARROW-944 - Python: Compat broken for pandas==0.18.1
  • ARROW-948 - [GLib] Update C++ header file list
  • ARROW-952 - fix regex include from C++ standard library
  • ARROW-958 - [Python] Fix conda source build instructions
  • ARROW-979 - [Python] Fix setuptools_scm version when release tag is not in the master timeline
  • ARROW-991 - [Python] Create new dtype when deserializing from Arrow to NumPy datetime64
  • ARROW-995 - [Website] Fix a typo
  • ARROW-998 - [Format] Clarify that the IPC file footer contains an additional copy of the schema
  • ARROW-1003 - [C++] Check flag _WIN32 instead of __WIN32
  • ARROW-1004 - [Python] Add conversions for numpy object arrays with integers and floats
  • ARROW-1017 - [Python] Fix memory leaks in conversion to pandas.DataFrame
  • ARROW-1023 - Python: Fix bundling of arrow-cpp for macOS
  • ARROW-1033 - [Python] pytest discovers scripts/test_leak.py
  • ARROW-1045 - [JAVA] Add support for custom metadata in org.apache.arrow.vector.types.pojo.*
  • ARROW-1046 - [Python] Reconcile pandas metadata spec
  • ARROW-1053 - [Python] Remove unnecessary Py_INCREF in PyBuffer causing memory leak
  • ARROW-1054 - [Python] Test suite fails on pandas 0.19.2
  • ARROW-1061 - [C++] Harden decimal parsing against invalid strings
  • ARROW-1064 - ModuleNotFoundError: No module named 'pyarrow._parquet'

New Features and Improvements

  • ARROW-29 - [C++] FindRe2 cmake module
  • ARROW-182 - [C++] Factor out Array::Validate into a separate function
  • ARROW-376 - Python: Convert non-range Pandas indices (optionally) to Arrow
  • ARROW-446 - [Python] Expand Sphinx documentation for 0.3
  • ARROW-482 - [Java] Exposing custom field metadata
  • ARROW-532 - [Python] Expand pyarrow.parquet documentation for 0.3 release
  • ARROW-579 - Python: Provide redistributable pyarrow wheels on OSX
  • ARROW-596 - [Python] Add convenience function to convert pandas.DataFrame to pyarrow.Buffer containing a file or stream representation
  • ARROW-629 - [JS] Add unit test suite
  • ARROW-714 - [C++] Add import_pyarrow C API in the style of NumPy for thirdparty C++ users
  • ARROW-819 - Public Cython and C++ API in the style of lxml, arrow::py::import_pyarrow method
  • ARROW-872 - [JS] Read streaming format
  • ARROW-873 - [JS] Implement fixed width list type
  • ARROW-874 - [JS] Read dictionary-encoded vectors
  • ARROW-881 - [Python] Reconstruct Pandas DataFrame indexes using metadata
  • ARROW-891 - [Python] Expand Windows build instructions to not require looking at separate C++ docs
  • ARROW-899 - [Doc] Add 0.3.0 changelog
  • ARROW-901 - [Python] Add Parquet unit test for fixed size binary
  • ARROW-913 - [Python] Only link jemalloc to the Cython extension where it's needed
  • ARROW-923 - Changelog generation Python script, add 0.1.0 and 0.2.0 changelog
  • ARROW-929 - Remove KEYS file from git
  • ARROW-943 - [GLib] Support running unit tests with source archive
  • ARROW-945 - [GLib] Add a Lua example to show Torch integration
  • ARROW-946 - [GLib] Use "new" instead of "open" for constructor name
  • ARROW-947 - [Python] Improve execution time of manylinux1 build
  • ARROW-953 - Use conda-forge cmake, curl in CI toolchain
  • ARROW-954 - Flag for compiling Arrow with header-only boost
  • ARROW-956 - [Python] compat with pandas >= 0.20.0
  • ARROW-957 - [Doc] Add HDFS and Windows documents to doxygen output
  • ARROW-961 - [Python] Rename InMemoryOutputStream to BufferOutputStream
  • ARROW-963 - [GLib] Add equal
  • ARROW-967 - [GLib] Support initializing array with buffer
  • ARROW-970 - [Python] Nicer experience if user accidentally calls pyarrow.Table ctor directly
  • ARROW-977 - [java] Add Timezone aware timestamp vectors
  • ARROW-980 - Fix detection of "msvc" COMPILER_FAMILY
  • ARROW-982 - [Website] Improve website front copy to highlight serialization efficiency benefits
  • ARROW-984 - [GLib] Add Go examples
  • ARROW-985 - [GLib] Update package information
  • ARROW-988 - [JS] Add entry to Travis CI matrix
  • ARROW-993 - [GLib] Add missing error checks in Go examples
  • ARROW-996 - [Website] Add 0.3.0 release announcement in Japanese
  • ARROW-997 - [Java] Implementing transferPair for FixedSizeListVector
  • ARROW-1000 - [GLib] Move install document to Website
  • ARROW-1001 - [GLib] Unify writer files
  • ARROW-1002 - [C++] Fix inconsistency with padding at start of IPC file format
  • ARROW-1008 - [C++] Add abstract stream writer and reader C++ APIs. Give clearer names to IPC reader/writer classes
  • ARROW-1010 - [Website] Provide for translations without repeating blog post in blogroll
  • ARROW-1011 - [FORMAT] fix typo and mistakes in Layout.md
  • ARROW-1014 - 0.4.0 release
  • ARROW-1015 - [Java] Schema-level metadata
  • ARROW-1016 - Python: Include C++ headers (optionally) in wheels
  • ARROW-1022 - [Python] Add multithreaded read option to read_feather
  • ARROW-1024 - Python: Update build time numpy version to 1.10.1
  • ARROW-1025 - [Website] Improved changelog for website, include git shortlog
  • ARROW-1027 - [Python] Allow negative indexing in fields/columns on pyarrow Table and Schema objects
  • ARROW-1028 - [Python] Fix IPC docs per API changes
  • ARROW-1029 - [Python] Fixes for building pyarrow with Parquet support on MSVC. Add to appveyor build
  • ARROW-1030 - Python: Account for library versioning in parquet-cpp
  • ARROW-1031 - [GLib] Support pretty print
  • ARROW-1037 - [GLib] Follow reader name change
  • ARROW-1038 - [GLib] Follow writer name change
  • ARROW-1040 - [GLib] Support tensor IO
  • ARROW-1044 - [GLib] Support Feather
  • ARROW-1126 - Python: Add function to convert NumPy/Pandas dtypes to Arrow DataTypes
wesm published 0.3.1

ptaylor published 0.3.0

Apache Arrow 0.3.0 (2017-05-05)

Bug Fixes

  • ARROW-109 - [C++] Add nesting stress tests up to 500 recursion depth
  • ARROW-208 - Add checkstyle policy to java project
  • ARROW-347 - Add method to pass CallBack when creating a transfer pair
  • ARROW-413 - DATE type is not specified clearly
  • ARROW-431 - [Python] Review GIL release and acquisition in to_pandas conversion
  • ARROW-443 - [Python] Support ingest of strided NumPy arrays from pandas
  • ARROW-451 - [C++] Implement DataType::Equals as TypeVisitor. Add default implementations for TypeVisitor, ArrayVisitor methods
  • ARROW-454 - pojo.Field doesn't implement hashCode()
  • ARROW-526 - [Format] Revise Format documents for evolution in IPC stream / file / tensor formats
  • ARROW-565 - [C++] Examine "Field::dictionary" member
  • ARROW-570 - Determine Java tools JAR location from project metadata
  • ARROW-584 - [C++] Fix compiler warnings exposed with -Wconversion
  • ARROW-586 - Problem with reading parquet files saved by Apache Spark
  • ARROW-588 - [C++] Fix some 32 bit compiler warnings
  • ARROW-595 - [Python] Set schema attribute on StreamReader
  • ARROW-604 - Python: boxed Field instances are missing the reference to their DataType
  • ARROW-611 - [Java] TimeVector TypeLayout is incorrectly specified as 64 bit width
  • ARROW-613 - WIP TypeScript Implementation
  • ARROW-617 - [Format] Add additional Time metadata and comments based on discussion in ARROW-617
  • ARROW-619 - [Python] Fixed remaining typo for LD_LIBRARY_PATH
  • ARROW-619 - Fix typos in setup.py args and LD_LIBRARY_PATH
  • ARROW-623 - Fix segfault in repr of empty field
  • ARROW-624 - [C++] Restore MakePrimitiveArray function, use in feather.cc
  • ARROW-627 - [C++] Add compatibility macros for exported extern templates
  • ARROW-628 - [Python] Install nomkl metapackage when building parquet-cpp in Travis CI
  • ARROW-630 - [C++] Create boolean batches for IPC testing, properly account for nonzero offset
  • ARROW-636 - [C++] Update README about Boost system requirement
  • ARROW-639 - [C++] Invalid offset in slices
  • ARROW-642 - [Java] Remove temporary file in java/tools
  • ARROW-644 - Python: Cython should be a setup-only requirement
  • ARROW-652 - Remove trailing f in merge script output
  • ARROW-654 - [C++] Serialize timezone in IPC metadata
  • ARROW-666 - [Python] Error in DictionaryArray __repr__
  • ARROW-667 - build of arrow-master/cpp fails with altivec error?
  • ARROW-668 - [Python] Box timestamp values as pandas.Timestamp if available, attach tzinfo
  • ARROW-671 - [GLib] Install missing license file
  • ARROW-673 - [Java] Support additional Time metadata
  • ARROW-677 - [java] Fix checkstyle jcl-over-slf4j conflict issue
  • ARROW-678 - [GLib] Fix dependencies
  • ARROW-680 - [C++] Support CMake 2 or older again
  • ARROW-682 - [Integration] Check implementations against themselves
  • ARROW-683 - [C++/Python] Refactor to make Date32 and Date64 types for new metadata. Test IPC roundtrip
  • ARROW-685 - [GLib] AX_CXX_COMPILE_STDCXX_11 error running ./configure
  • ARROW-686 - [C++] Account for time metadata changes, add Time32 and Time64 types
  • ARROW-689 - [GLib] Fix install directories
  • ARROW-691 - [Java] Encode dictionary type in message format
  • ARROW-697 - JAVA Throw exception for record batches > 2GB
  • ARROW-699 - [C++] Resolve Arrow and Arrow IPC build issues on Windows
  • ARROW-702 - fix BitVector.copyFromSafe to reAllocate instead of returning false
  • ARROW-703 - Fix issue where setValueCount(0) doesn’t work in the case that we’ve shipped vectors across the wire
  • ARROW-704 - Fix bad import caused by conflicting changes
  • ARROW-709 - [C++] Restore type comparator for DecimalType
  • ARROW-713 - [C++] Fix cmake linking issue in new IPC benchmark
  • ARROW-715 - [Python] Make pandas not a hard requirement, flake8 fixes
  • ARROW-716 - [Python] Update README build instructions after moving libpyarrow to C++ tree
  • ARROW-720 - arrow should not have a dependency on slf4j bridges in com…
  • ARROW-723 - [Python] Ensure that passing chunk_size=0 when writing Parquet file does not enter infinite loop
  • ARROW-726 - [C++] Fix segfault caused when passing non-buffer object to arrow::py::PyBuffer
  • ARROW-732 - [C++] Schema comparison bugs in struct and union types
  • ARROW-736 - [Python] Mixed-type object DataFrame columns should not silently co…
  • ARROW-738 - Fix manylinux1 build
  • ARROW-739 - Don't install jemalloc in parallel
  • ARROW-740 - FileReader fails for large objects
  • ARROW-747 - [C++] Calling add_dependencies with dl causes spurious CMake warning
  • ARROW-749 - [Python] Delete partially-written Feather file when column write fails
  • ARROW-753 - [Python] Fix linker error for python-test on OS X
  • ARROW-756 - [C++] MSVC build fixes and cleanup, remove -fPIC flag from EP builds on Windows, Dev docs
  • ARROW-757 - [C++] MSVC build fails on googletest when using NMake
  • ARROW-762 - [Python] Start docs page about files and filesystems, adapt C++ docs about HDFS
  • ARROW-776 - [GLib] Fix wrong type name
  • ARROW-777 - restore getObject behavior on Date and Time
  • ARROW-778 - Port merge tool to work on Windows
  • ARROW-780 - PYTHON_EXECUTABLE Required to be set during build
  • ARROW-781 - [C++/Python] Increase reference count of the numpy base array?
  • ARROW-783 - [Java/C++] Fixes for 0-length record batches
  • ARROW-787 - [GLib] Fix compilation error caused by introducing BooleanBuilder::Append overload
  • ARROW-789 - Fix issue where setValueCount(0) doesn’t work in the case that we’ve shipped vectors across the wire
  • ARROW-793 - [GLib] Fix indent
  • ARROW-794 - [C++/Python] Disallow strided tensors in ipc::WriteTensor
  • ARROW-796 - [Java] Checkstyle additions causing build failure in some environments
  • ARROW-797 - [Python] Make more explicitly curated public API page, sphinx cleanup
  • ARROW-800 - [C++] Boost headers being transitively included in pyarrow
  • ARROW-805 - [C++] Don't throw IOError when listing empty HDFS dir
  • ARROW-809 - [C++] Do not write excess bytes in IPC writer after slicing arrays
  • ARROW-812 - Pip install pyarrow on mac failed.
  • ARROW-817 - [Python] Fix comment in date32 conversion
  • ARROW-821 - [Python] Extra file _table_api.h generated during Python build process
  • ARROW-822 - [Python] StreamWriter Wrapper for Socket and File-like Objects without tell()
  • ARROW-826 - [C++/Python] Fix compilation error on Mac with -DARROW_PYTHON=on
  • ARROW-829 - Don't deactivate Parquet dictionary encoding on column-wis…
  • ARROW-830 - [Python] Expose jemalloc memory pool and other memory pool functions in public pyarrow API
  • ARROW-836 - add test for pandas conversion of timedelta, currently unimplemented
  • ARROW-839 - [Python] Use mktime variant that is reliable on MSVC
  • ARROW-847 - Specify BUILD_BYPRODUCTS for gtest
  • ARROW-852 - Also search for ARROW libs when pkg-config provided the path
  • ARROW-853 - [Python] Only set RPATH when bundling the shared libraries
  • ARROW-858 - Remove boost_regex from arrow dependencies
  • ARROW-866 - [Python] Be robust to PyErr_Fetch returning a null exc value
  • ARROW-867 - [Python] pyarrow MSVC fixes
  • ARROW-875 - Avoid setting an extra empty in fillEmpties()
  • ARROW-879 - compat with pandas v0.20.0
  • ARROW-882 - [C++] Rename statically build library on Windows to avoid …
  • ARROW-883 - [Java] Introduction of new types has shifted Enumerations
  • ARROW-885 - [Python/C++] Decimal test failure on MSVC
  • ARROW-886 - [Java] Fixing reallocation of VariableLengthVector offsets
  • ARROW-887 - add default value to units for backward compatibility
  • ARROW-888 - Transfer ownership of buffer in BitVector transferTo()
  • ARROW-895 - Fix lastSet in fillEmpties() and copyFrom()
  • ARROW-900 - [Python] Fix UnboundLocalError in ParquetDatasetPiece.read
  • ARROW-903 - [GLib] Remove a needless "."
  • ARROW-914 - [C++/Python] Fix Decimal ToBytes
  • ARROW-922 - Allow Flatbuffers and RapidJSON to be used locally on Windows
  • ARROW-927 - C++/Python: Add manylinux1 builds to Travis matrix
  • ARROW-928 - [C++] Detect supported MSVC versions
  • ARROW-933 - [Python] Remove debug print statement
  • ARROW-934 - [GLib] Glib sources missing from result of 02-source.sh
  • ARROW-936 - add missing file; revert tag change
  • ARROW-936 - fix release README
  • ARROW-938 - Fix Rat license warnings

New Features and Improvements

  • ARROW-6 - Hope to add development document
  • ARROW-39 - C++: Logical chunked arrays / columns: conforming to fixed chunk sizes
  • ARROW-52 - Set up project blog
  • ARROW-95 - Add Jekyll-based website publishing toolchain, migrate existing arrow-site
  • ARROW-98 - Java: API documentation
  • ARROW-99 - C++: Explore if RapidCheck may be helpful for testing / worth adding to toolchain
  • ARROW-183 - C++: Add storage type to DecimalType
  • ARROW-231 - [C++] : Add typed Resize to PoolBuffer
  • ARROW-281 - [C++] IPC/RPC support on Win32 platforms
  • ARROW-316 - [Format] Changes to Date metadata format per discussion in ARROW-316
  • ARROW-341 - [Python] Move pyarrow's C++ code to the main C++ source tree, install libarrow_python and headers
  • ARROW-452 - [C++/Python] Incorporate C++ and Python codebases for Feather file format
  • ARROW-459 - [C++] Dictionary IPC support in file and stream formats
  • ARROW-483 - [C++/Python] Provide access to "custom_metadata" Field attribute in IPC setting
  • ARROW-491 - [Format / C++] Add FixedWidthBinary type to format, C++ implementation
  • ARROW-492 - [C++] Add arrow/arrow.h public API
  • ARROW-493 - [C++] Permit large (length > INT32_MAX) arrays in memory
  • ARROW-502 - [C++/Python] : Logging memory pool
  • ARROW-510 - ARROW-582 ARROW-663 ARROW-729: [Java] Added units for Time and Date types, and integration tests
  • ARROW-518 - C++: Make Status::OK method constexpr
  • ARROW-520 - [C++] STL-compliant allocator
  • ARROW-528 - [Python] Utilize improved Parquet writer C++ API, add write_metadata function, test _metadata files
  • ARROW-534 - [C++] Add IPC tests for date/time after ARROW-452, fix bugs
  • ARROW-539 - [Python] Add support for reading partitioned Parquet files with Hive-like directory schemes
  • ARROW-542 - Adding dictionary encoding to FileWriter
  • ARROW-550 - [Format] Draft experimental Tensor flatbuffer message type
  • ARROW-552 - [Python] Implement getitem for DictionaryArray by returning a value from the dictionary
  • ARROW-557 - [Python] Add option to explicitly opt in to HDFS tests, do not implicitly skip
  • ARROW-563 - Support non-standard gcc version strings
  • ARROW-566 - Bundle Arrow libraries in Python package
  • ARROW-568 - [C++] Add default implementations for TypeVisitor, ArrayVisitor methods that return NotImplemented
  • ARROW-569 - [C++] Set version for *.pc
  • ARROW-574 - Python: Add support for nested Python lists in Pandas conversion
  • ARROW-576 - [C++] Complete file/stream implementation for union types
  • ARROW-577 - [C++] Use private implementation pattern in ipc::StreamWriter and ipc::FileWriter
  • ARROW-578 - [C++] Add -DARROW_CXXFLAGS=... option to make CMake more consistent
  • ARROW-580 - C++: Also provide jemalloc_X targets if only a static or shared version is found
  • ARROW-582 - [Java] Added JSON reader/writer unit test for date, time, and timestamp
  • ARROW-589 - C++: Use system provided shared jemalloc if static is unavailable
  • ARROW-591 - [C++] Add round trip testing fixture for JSON format
  • ARROW-593 - [C++] : Rename ReadableFileInterface to RandomAccessFile
  • ARROW-598 - [Python] Add support for converting pyarrow.Buffer to a memoryview with zero copy
  • ARROW-603 - [C++] Add RecordBatch::Validate method, call in RecordBatch ctor in debug builds
  • ARROW-605 - [C++] Refactor IPC adapter code into generic ArrayLoader class. Add Date32Type
  • ARROW-606 - [C++] upgrade flatbuffers version to 1.6.0
  • ARROW-608 - [Format] Days since epoch date type
  • ARROW-610 - [C++] Win32 compatibility in file.cc
  • ARROW-612 - [Java] Added not null to Field.toString output
  • ARROW-615 - [Java] Moved ByteArrayReadableSeekableByteChannel to src main o.a.a.vector.util
  • ARROW-616 - [C++] Do not include debug symbols in release builds by default
  • ARROW-618 - [Python/C++] Support timestamp+timezone conversion to pandas
  • ARROW-620 - [C++] Implement JSON integration test support for date, time, timestamp, fixed width binary
  • ARROW-621 - [C++] Start IPC benchmark suite for record batches, implement "inline" visitor. Code reorg
  • ARROW-625 - [C++] Add TimeUnit to TimeType::ToString. Add timezone to TimestampType::ToString if present
  • ARROW-626 - [Python] Replace PyBytesBuffer with zero-copy, memoryview-based PyBuffer
  • ARROW-631 - [GLib] Import
  • ARROW-632 - [Python] Add support for FixedWidthBinary type
  • ARROW-635 - [C++] Add JSON read/write support for FixedWidthBinary
  • ARROW-637 - [Format] Add timezone to Timestamp metadata, comments describing the semantics
  • ARROW-646 - [Python] Conda s3 robustness, set CONDA_PKGS_DIR env variable and add Travis CI caching
  • ARROW-647 - [C++] Use Boost shared libraries for tests and utilities
  • ARROW-648 - [C++] Support multiarch on Debian
  • ARROW-650 - [GLib] Follow ReadableFileInterface -> RandomAccessFile change
  • ARROW-651 - [C++] Set version to shared library
  • ARROW-655 - [C++/Python] Implement DecimalArray
  • ARROW-656 - [C++] Add random access writer for a mutable buffer. Rename WriteableFileInterface to WriteableFile for better consistency
  • ARROW-657 - [C++/Python] Expose Tensor IPC in Python. Add equals method. Add pyarrow.create_memory_map/memory_map functions
  • ARROW-658 - [C++] Implement a prototype in-memory arrow::Tensor type
  • ARROW-659 - [C++] Add multithreaded memcpy implementation
  • ARROW-660 - [C++] Restore function that can read a complete encapsulated record batch message
  • ARROW-661 - [C++] Add LargeRecordBatch metadata type, IPC support, associated refactoring
  • ARROW-662 - [Format] Move Schema flatbuffers into their own file that can be included
  • ARROW-663 - [Java] Support additional Time metadata + vector value accessors
  • ARROW-664 - [C++] Make C++ Arrow serialization deterministic
  • ARROW-669 - [Python] Attach proper tzinfo when computing boxed scalars for TimestampArray
  • ARROW-670 - Arrow 0.3 release
  • ARROW-672 - [Format] Add MetadataVersion::V3 for Arrow 0.3
  • ARROW-674 - [Java] Support additional Timestamp timezone metadata
  • ARROW-675 - [GLib] Update package metadata
  • ARROW-676 - move from MinorType to FieldType in ValueVectors to carry all the relevant type bits
  • ARROW-679 - [Format] Change FieldNode, RecordBatch lengths to long, remove LargeRecordBatch. Refactoring
  • ARROW-681 - [C++] Disable boost's autolinking if shared boost is used …
  • ARROW-684 - [Python] More helpful error message if libparquet_arrow not built
  • ARROW-687 - [C++] Build and run full test suite in Appveyor
  • ARROW-688 - [C++] Use CMAKE_INSTALL_INCLUDEDIR for consistency
  • ARROW-690 - Only send JIRA updates to issues@arrow.apache.org
  • ARROW-698 - Add flag to FileWriter::WriteRecordBatch for writing record batches with lengths over INT32_MAX
  • ARROW-700 - Add headroom interface for allocator
  • ARROW-701 - [Java] Support Additional Date Type Metadata
  • ARROW-706 - [GLib] Add package install document
  • ARROW-707 - [Python] Return NullArray for array of all None in Array.from_pandas. Revert from_numpy -> from_pandas
  • ARROW-708 - [C++] Simplify metadata APIs to all use the Message class, perf analysis
  • ARROW-710 - [Python] Read/write with file-like Python objects from read_feather/write_feather
  • ARROW-711 - [C++] Remove extern template declarations for NumericArray<T> types
  • ARROW-712 - [C++] Reimplement Array::Accept as inline visitor
  • ARROW-717 - [C++] Implement IPC zero-copy round trip for tensors
  • ARROW-718 - [Python] Implement pyarrow.Tensor container, zero-copy NumPy roundtrips
  • ARROW-719 - [GLib] Release source archive
  • ARROW-722 - [Python] Support additional date/time types and metadata, conversion to/from NumPy and pandas.DataFrame
  • ARROW-724 - Add How to Contribute section to README
  • ARROW-725 - [Formats/Java] FixedSizeList message and java implementation
  • ARROW-727 - [Python] Ensure that NativeFile.write accepts any bytes, unicode, or object providing buffer protocol. Rename build_arrow_buffer to pyarrow.frombuffer
  • ARROW-728 - [C++/Python] Add Table::RemoveColumn method, remove name member, some other code cleaning
  • ARROW-729 - [Java] Add vector type for 32-bit date as days since UNIX epoch
  • ARROW-731 - [C++] Add shared library related versions to .pc
  • ARROW-733 - [C++/Python] Rename FixedWidthBinary to FixedSizeBinary for consistency with FixedSizeList
  • ARROW-734 - [C++/Python] Support building PyArrow on MSVC
  • ARROW-735 - [C++] Developer instruction document for MSVC on Windows
  • ARROW-737 - [C++] Enable mutable buffer slices, SliceMutableBuffer function
  • ARROW-741 - [Python] Switch Travis CI to use Python 3.6 instead of 3.5
  • ARROW-743 - [C++] Consolidate all but decimal array tests into array-test, collect some tests in type-test.cc
  • ARROW-744 - [GLib] Re-add an assertion for garrow_table_new() test
  • ARROW-745 - [C++] Allow use of system cpplint
  • ARROW-746 - [GLib] Add garrow_array_get_data_type()
  • ARROW-748 - [Python] Pin runtime library versions in conda-forge packages to force upgrades
  • ARROW-751 - [Python] Make all Cython modules private. Some code tidying
  • ARROW-752 - [Python] Support boxed Arrow arrays as input to DictionaryArray.from_arrays
  • ARROW-754 - [GLib] Add garrow_array_is_null()
  • ARROW-755 - [GLib] Add garrow_array_get_value_type()
  • ARROW-758 - [C++] Build with /WX in Appveyor, fix MSVC compiler warnings
  • ARROW-761 - [C++/Python] Add GetTensorSize method, Python bindings
  • ARROW-763 - C++: Use to find libpythonX.X.dylib
  • ARROW-765 - [Python] Add more natural Exception type hierarchy for thirdparty users
  • ARROW-768 - [Java] Change the "boxed" object representation of date and time types
  • ARROW-769 - [GLib] Support building without installed Arrow C++
  • ARROW-770 - [C++] Move .clang* files back into cpp source tree
  • ARROW-771 - [Python] Add read_row_group / num_row_groups to ParquetFile
  • ARROW-773 - [C++] Add Table::AddColumn API
  • ARROW-774 - [GLib] Remove needless LICENSE.txt copy
  • ARROW-775 - add simple constructors to value vectors
  • ARROW-779 - [C++] Check for old metadata and raise exception if found
  • ARROW-782 - [C++] API cleanup, change public member access in DataType classes to functions, use class instead of struct
  • ARROW-788 - [C++] Align WriteTensor message
  • ARROW-795 - [C++] Consolidate arrow/arrow_io/arrow_ipc into a single shared and static library
  • ARROW-798 - [Docs] Publish Format Markdown documents somehow on arrow.apache.org
  • ARROW-802 - [GLib] Add read examples
  • ARROW-803 - [GLib] Update package repository URL
  • ARROW-804 - [GLib] Update build document
  • ARROW-806 - [GLib] Support add/remove a column from table
  • ARROW-807 - [GLib] Update "Since" tag
  • ARROW-808 - [GLib] Remove needless ignore entries
  • ARROW-810 - [GLib] Remove io/ipc prefix
  • ARROW-811 - [GLib] Add GArrowBuffer
  • ARROW-815 - [Java] Exposing reAlloc for ValueVector
  • ARROW-816 - [C++] Travis CI script cleanup, add C++ toolchain env with Flatbuffers, RapidJSON
  • ARROW-818 - [Python] Expand Sphinx API docs, pyarrow.* namespace. Add factory functions for time32, time64
  • ARROW-820 - [C++] Build dependencies for Parquet library without arrow…
  • ARROW-825 - [Python] Rename pyarrow.from_pylist to pyarrow.array, test on tuples
  • ARROW-827 - [Python] Miscellaneous improvements to help with Dask support
  • ARROW-828 - [C++] Add new dependency to README
  • ARROW-831 - Switch from boost::regex to std::regex
  • ARROW-832 - [C++] Update to gtest 1.8.0, remove now unneeded test_main.cc
  • ARROW-833 - [Python] Add Developer quickstart for conda users
  • ARROW-841 - [Python] Add pyarrow build to Appveyor
  • ARROW-844 - [Format] Update README documents in format/
  • ARROW-845 - [Python] Sync changes from PARQUET-955; explicit ARROW_HOME will override pkgconfig
  • ARROW-846 - [GLib] Add GArrowTensor, GArrowInt8Tensor and GArrowUInt8Tensor
  • ARROW-848 - [Python] Another pass on conda dev guide
  • ARROW-849 - [C++] Support setting production build dependencies with ARROW_BUILD_TOOLCHAIN
  • ARROW-857 - [Python] Automate publishing Python documentation to arrow-site
  • ARROW-859 - [C++] Do not build unit tests by default?
  • ARROW-860 - [C++] Remove typed Tensor containers
  • ARROW-861 - [Python] Move DEVELOPMENT.md to Sphinx docs
  • ARROW-862 - [Python] Simplify README landing documentation to direct users and developers toward the documentation
  • ARROW-863 - [GLib] Use GBytes to implement zero-copy
  • ARROW-864 - [GLib] Unify Array files
  • ARROW-865 - [Python] Add unit tests validating Parquet date/time type roundtrips
  • ARROW-868 - [GLib] Use GBytes to reduce copy
  • ARROW-869 - [JS] Rename directory to js/
  • ARROW-871 - [GLib] Unify DataType files
  • ARROW-876 - [GLib] Unify ArrayBuilder files
  • ARROW-877 - [GLib] Add garrow_array_get_null_bitmap()
  • ARROW-878 - [GLib] Add garrow_binary_array_get_buffer()
  • ARROW-880 - [GLib] Support getting raw data of primitive arrays
  • ARROW-890 - [GLib] Add GArrowMutableBuffer
  • ARROW-892 - [GLib] Fix GArrowTensor document
  • ARROW-893 - Add GLib document to Web site
  • ARROW-894 - [GLib] Add GArrowResizableBuffer and GArrowPoolBuffer
  • ARROW-896 - Support Jupyter Notebook in Web site
  • ARROW-898 - [C++/Python] Use shared_ptr to avoid copying KeyValueMetadata, add to Field type also
  • ARROW-904 - [GLib] Simplify error check codes
  • ARROW-907 - C++: Construct Table from schema and arrays
  • ARROW-908 - [GLib] Unify OutputStream files
  • ARROW-910 - [C++] Write 0 length at EOS in StreamWriter
  • ARROW-916 - [GLib] Add GArrowBufferOutputStream
  • ARROW-917 - [GLib] Add GArrowBufferReader
  • ARROW-918 - [GLib] Use GArrowBuffer for read buffer
  • ARROW-919 - [GLib] Use "id" to get type enum value from GArrowDataType
  • ARROW-920 - [GLib] Add Lua examples
  • ARROW-925 - [GLib] Fix GArrowBufferReader test
  • ARROW-926 - Add wesm to KEYS
  • ARROW-930 - javadoc generation fails with java 8
  • ARROW-931 - [GLib] Reconstruct input stream
  • ARROW-965 - Website updates for 0.3.0 release