@apache-arrow/ts - npm Package Versions

0.14.1

Apache Arrow 0.14.1 (2019-07-22)

Bug Fixes

  • ARROW-5775 - [C++] Fix thread-unsafe cached data
  • ARROW-5790 - [Python] Raise error when trying to convert 0-dim array in pa.array
  • ARROW-5791 - [C++] Fix infinite loop with more than 32768 columns.
  • ARROW-5816 - [Release] Do not curl in background in verify-release-candidate.sh
  • ARROW-5836 - [Java][FlightRPC] Skip Flight domain socket test when path too long
  • ARROW-5838 - [C++] Delegate OPENSSL_ROOT_DIR to bundled gRPC
  • ARROW-5849 - [C++] Fix compiler warnings on mingw32
  • ARROW-5850 - [CI][R] R appveyor job is broken after release
  • ARROW-5851 - [C++] Fix compilation of reference benchmarks
  • ARROW-5856 - [Python][Packaging] Fix use of C++ / Cython API from wheels
  • ARROW-5863 - [Python] Use atexit module for extension type finalization to avoid segfault
  • ARROW-5868 - [Python] Correctly remove liblz4 shared libraries from manylinux2010 image so lz4 is statically linked
  • ARROW-5873 - [Python] Guard for passed None in Schema.equals
  • ARROW-5874 - [Python] Fix macOS wheels to depend on system or Homebrew OpenSSL
  • ARROW-5878 - [C++][Parquet] Restore pre-0.14.0 Parquet forward compatibility by adding option to unconditionally set TIMESTAMP_MICROS/TIMESTAMP_MILLIS ConvertedType
  • ARROW-5886 - [Python][Packaging] Manylinux1/2010 compliance issue with libz
  • ARROW-5887 - [C#] ArrowStreamWriter writes FieldNodes in wrong order
  • ARROW-5889 - [C++][Parquet] Add property to indicate origin from converted type to TimestampLogicalType
  • ARROW-5899 - [Python][Packaging] Build and link uriparser statically in Windows wheel builds
  • ARROW-5921 - [C++] Fix multiple nullptr related crashes in IPC
  • PARQUET-1623 - [C++] Fix invalid memory access encountered when reading some parquet files

New Features and Improvements

  • ARROW-5101 - [Packaging] Avoid bundling static libraries in Windows conda packages
  • ARROW-5380 - [C++] Fix memory alignment UBSan errors.
  • ARROW-5564 - [C++] Use uriparser from conda-forge
  • ARROW-5609 - [C++] Set CMP0068 CMake policy to avoid macOS warnings
  • ARROW-5784 - [Release][GLib] Replace c_glib/ after running c_glib/autogen.sh in dev/release/02-source.sh
  • ARROW-5785 - [Rust] Make the datafusion cli dependencies optional
  • ARROW-5787 - [Release][Rust] Use local modules to verify RC
  • ARROW-5793 - [Release] Avoid duplicated known host SSH error in dev/release/03-binary.sh
  • ARROW-5794 - [Release] Skip uploading already uploaded binaries
  • ARROW-5795 - [Release] Add missing waits on uploading binaries
  • ARROW-5796 - [Release][APT] Update expected package list
  • ARROW-5797 - [Release][APT] Update supported distributions
  • ARROW-5820 - [Release] Remove undefined variable check from verify script
  • ARROW-5827 - [C++] Require c-ares CMake config
  • ARROW-5828 - [C++] Add required Protocol Buffers versions check
  • ARROW-5866 - [C++] Remove duplicate library in cpp/Brewfile
  • ARROW-5877 - [FlightRPC] Fix Python<->Java auth issues
  • ARROW-5904 - [Java][Plasma] Fix compilation of Plasma Java client
  • ARROW-5908 - [C#] ArrowStreamWriter doesn't align buffers to 8 bytes
  • ARROW-5934 - [Python] Bundle arrow's LICENSE with the wheels
  • ARROW-5937 - [Release] Stop parallel binary upload
  • ARROW-5938 - [Release] Create branch for adding release note automatically
  • ARROW-5939 - [Release] Add support for generating vote email template separately
  • ARROW-5940 - [Release] Add support for re-uploading sign/checksum for binary artifacts
  • ARROW-5941 - [Release] Avoid re-uploading already uploaded binary artifacts
  • ARROW-5958 - [Python] Link zlib statically in the wheels

0.14.0

Published by kou

Apache Arrow 0.14.0 (2019-07-04)

New Features and Improvements

  • ARROW-258 - [Format] clarify definition of Buffer in context of RPC, IPC, File
  • ARROW-653 - [Python / C++] Add debugging function to print an array's buffer contents in hexadecimal
  • ARROW-767 - [C++] Filesystem abstraction
  • ARROW-835 - [Format][C++][Java] Create a new Duration type
  • ARROW-840 - [Python] Expose extension types
  • ARROW-973 - [Website] Add FAQ page
  • ARROW-1012 - [C++] Configurable batch size for parquet RecordBatchReader
  • ARROW-1207 - [C++] Implement MapArray, MapBuilder, MapType classes, and IPC support
  • ARROW-1261 - [Java] Add MapVector with reader and writer
  • ARROW-1278 - [Integration] Adding integration tests for fixed_size_list
  • ARROW-1279 - [Integration] Enable MapType integration tests
  • ARROW-1280 - [C++] add fixed size list type
  • ARROW-1349 - [Packaging] Provide APT and Yum repositories
  • ARROW-1496 - [JS] Upload coverage data to codecov.io
  • ARROW-1558 - [C++] Implement boolean filter (selection) kernel, rename comparison kernel-related functions
  • ARROW-1587 - [Format] Add metadata for user-defined logical types
  • ARROW-1774 - [C++] Add Array::View()
  • ARROW-1833 - [Java] Add accessor methods for data buffers that skip null checking
  • ARROW-1957 - [Python] Write nanosecond timestamps using new NANO LogicalType Parquet unit
  • ARROW-1983 - [C++][Parquet] Add AppendRowGroups and WriteMetaDataFile methods
  • ARROW-2057 - [Python] Expose option to configure data page size threshold in parquet.write_table
  • ARROW-2102 - [C++] Implement Take kernel
  • ARROW-2103 - [C++] Implement take kernel functions - string/binary value type
  • ARROW-2104 - [C++] take kernel functions for nested types
  • ARROW-2105 - [C++] Implement take kernel functions - properly handle special indices
  • ARROW-2186 - [C++] Clean up architecture specific compiler flags
  • ARROW-2217 - [C++] Add option to use dynamic linking for compression library dependencies
  • ARROW-2298 - [Python] Add unit tests to assert that float64 with NaN values can be safely coerced to integer types when converting from pandas
  • ARROW-2412 - [Integration] Add nested dictionary test case, skipped for now
  • ARROW-2467 - [Rust] Add generated IPC code
  • ARROW-2517 - [Java] Add list<decimal> writer
  • ARROW-2618 - [Rust] Bitmap constructor should accept a flag for default state (0 or 1)
  • ARROW-2667 - [C++/Python] Add pandas-like take method to Array
  • ARROW-2707 - [C++] Add Table::Slice
  • ARROW-2709 - [Python] write_to_dataset poor performance when splitting
  • ARROW-2730 - [C++] Set up CMAKE_C_FLAGS more thoughtfully instead of using CMAKE_CXX_FLAGS
  • ARROW-2796 - [C++] Simplify version script used for linking
  • ARROW-2818 - [Python] Better error message when trying to convert sparse pandas data to arrow Table
  • ARROW-2835 - [C++] Make file position undefined after ReadAt()
  • ARROW-2969 - [R] Convert between StructArray and "nested" data.frame column containing data frame in each cell
  • ARROW-2981 - [C++] improve clang-tidy usability
  • ARROW-2984 - [JS] Refactor release verification script to share code with main source release verification script
  • ARROW-3040 - [Go] add support for comparing Arrays
  • ARROW-3041 - [Go] add support for TimeArray
  • ARROW-3052 - [C++] Detect Apache ORC C++ libraries in system/conda toolchain, add to conda requirements
  • ARROW-3087 - [C++] Implement Compare filter kernel
  • ARROW-3144 - [C++/Python] Move "dictionary" member from DictionaryType to ArrayData to allow for variable dictionaries
  • ARROW-3150 - [Python] Enable Flight in Python wheels for Linux and Windows
  • ARROW-3166 - [C++] Consolidate IO interfaces used in arrow/io and parquet-cpp
  • ARROW-3191 - [Java] Make ArrowBuf work with arbitrary underlying memory
  • ARROW-3200 - [C++] Support dictionaries in Flight streams
  • ARROW-3290 - [C++] Toolchain support for secure gRPC
  • ARROW-3294 - [C++][Flight] Support Flight on Windows
  • ARROW-3314 - [R] Set -rpath using pkg-config when building
  • ARROW-3330 - [C++] Spawn multiple Flight performance servers in flight-benchmark to test parallel get performance
  • ARROW-3419 - [C++] Run include-what-you-use checks as nightly build
  • ARROW-3459 - [C++][Gandiva] Add support for variable length output vectors
  • ARROW-3475 - [C++] Allow builders to finish to the corresponding array type
  • ARROW-3570 - [Packaging] Don't bundle test data files with python wheels
  • ARROW-3572 - [Crossbow] Raise more helpful exception if Crossbow queue has an SSH origin URL
  • ARROW-3671 - [Go] implement MonthInterval and DayTimeInterval
  • ARROW-3676 - [Go] implement Decimal128 array
  • ARROW-3679 - [Go] implement read/write IPC for Decimal128
  • ARROW-3680 - [Go] implement Float16 array
  • ARROW-3686 - [Python] support masked arrays in pa.array
  • ARROW-3702 - [R] POSIXct mapped to DateType not TimestampType?
  • ARROW-3714 - [CI] Run RAT checks in pre-commit hooks
  • ARROW-3729 - [C++][Parquet] Use logical annotations in Arrow Parquet reader/writer
  • ARROW-3732 - [R] Add functions to write RecordBatch or Schema to Message value, then read back
  • ARROW-3758 - [R] Build R library and dependencies on Windows in Appveyor CI
  • ARROW-3759 - [R][CI] Build and test (no libarrow) on Windows in Appveyor
  • ARROW-3767 - [C++] Add cast from null to any other type
  • ARROW-3780 - [R] Failed to fetch data: invalid data when collecting int16
  • ARROW-3791 - [C++ / Python] Add boolean type inference to the CSV parser
  • ARROW-3794 - [R] Consider mapping INT8 to integer() not raw()
  • ARROW-3804 - [R] Support older versions of R runtime
  • ARROW-3810 - [R] type= argument for Array and ChunkedArray
  • ARROW-3811 - [R] Support inferring data.frame column as StructArray in array constructors
  • ARROW-3814 - [R] RecordBatch$from_arrays()
  • ARROW-3815 - [R] Refine record batch factory
  • ARROW-3848 - [R] allow nbytes to be missing in RandomAccessFile$Read()
  • ARROW-3897 - [MATLAB] Add MATLAB support for writing numeric datatypes to a Feather file
  • ARROW-3904 - [C++/Python] Validate scale and precision of decimal128 type
  • ARROW-4013 - [Docs][C++] Add how to build on MSYS2
  • ARROW-4020 - [Release] Add a post release script to remove RC
  • ARROW-4047 - [Python] Document use of int96 timestamps and options in Parquet docs
  • ARROW-4086 - [Java] Add apis to debug memory alloc failures
  • ARROW-4121 - [C++] Refactor memory allocation from InvertKernel
  • ARROW-4159 - [C++] Build with -Wdocumentation when using clang and BUILD_WARNING_LEVEL=CHECKIN
  • ARROW-4194 - [Format][Docs] Remove duplicated / out-of-date logical type information from documentation
  • ARROW-4302 - [C++] Add OpenSSL to C++ build toolchain (#4384)
  • ARROW-4337 - [C#] Implemented Fluent API for building arrays and record batches
  • ARROW-4343 - [C++] Add docker-compose test for gcc 4.8 / Ubuntu 14.04 (Trusty), expand Xenial/16.04 Dockerfile to test Flight
  • ARROW-4356 - [CI] Add integration (docker) test for turbodbc
  • ARROW-4369 - [Packaging] Release verification script should test linux packages via docker
  • ARROW-4452 - [Python] Serialize sparse torch tensors
  • ARROW-4453 - [Python] Create Cython wrappers for SparseTensor
  • ARROW-4467 - [Rust][DataFusion] Create a REPL & Dockerfile for DataFusion
  • ARROW-4503 - [C#] Eliminate allocations in ArrowStreamReader when reading from a Stream
  • ARROW-4504 - [C++] Reduce number of C++ unit test executables from 128 to 82
  • ARROW-4505 - [C++] adding pretty print for dates, times, and timestamps
  • ARROW-4566 - [Flight] Add option to run Flight benchmark against separate server
  • ARROW-4596 - [Rust][DataFusion] Implement COUNT
  • ARROW-4622 - [C++][Python] MakeDense and MakeSparse in UnionArray should accept a vector of Field
  • ARROW-4625 - [Flight][Java] Add method to await Flight server termination in Java
  • ARROW-4626 - [Flight] Add application-defined metadata to DoGet/DoPut
  • ARROW-4627 - [Flight] Add application metadata field to DoPut
  • ARROW-4701 - [C++] Add JSON chunker benchmarks
  • ARROW-4702 - [C++] Update dependency versions
  • ARROW-4708 - [C++] add multithreaded json reader
  • ARROW-4708 - [C++] refactoring JSON parser to prepare for multithreaded impl
  • ARROW-4714 - [C++][Java] Providing JNI interface to Read ORC file via Arrow C++
  • ARROW-4717 - [C#] Consider exposing ValueTask instead of Task
  • ARROW-4719 - [C#] Implement ChunkedArray, Column and Table in C#
  • ARROW-4741 - [Java] Add missing type javadoc and enable checkstyle
  • ARROW-4787 - [C++] Add support for Null in MemoTable and related kernels
  • ARROW-4788 - [C++] Less verbose API for constructing StructArray
  • ARROW-4800 - [C++] Introduce a Result<T> class
  • ARROW-4805 - [Rust] Write temporal arrays to CSV
  • ARROW-4806 - [Rust] Temporal array casts
  • ARROW-4824 - [Python] Fix error checking in read_csv()
  • ARROW-4827 - [C++] Implement benchmark comparison
  • ARROW-4847 - [Python] Add pyarrow.table factory function
  • ARROW-4904 - [C++] Move implementations in arrow/ipc/test-common.h into libarrow_testing
  • ARROW-4911 - [R] Progress towards completing windows support
  • ARROW-4912 - [C++] add method for easy renaming of a Table's columns
  • ARROW-4913 - [Java][Memory] Add additional methods for observing allocations.
  • ARROW-4945 - [Flight] Enable integration tests in Travis
  • ARROW-4956 - [C#] Allow ArrowBuffers to wrap external Memory
  • ARROW-4959 - [C++][Gandiva][Crossbow] Gandiva crossbow packaging changes.
  • ARROW-4968 - [Rust] Assert that struct array field types match data in…
  • ARROW-4971 - [Go] Add type equality test function
  • ARROW-4972 - [Go] implement ArrayEquals
  • ARROW-4973 - [Go] implement ArraySliceEqual
  • ARROW-4974 - [Go] implement ArrayApproxEqual
  • ARROW-4990 - [C++] Support Array-Array comparison
  • ARROW-4993 - [C++] Add simple build configuration summary
  • ARROW-5000 - [Python] Fix 'SO' DeprecationWarning in setup.py
  • ARROW-5007 - [C++] Remove DCHECK in intrinsic headers
  • ARROW-5020 - [CI] Split Gandiva-related packages into separate .yml file
  • ARROW-5027 - [Python] Python bindings for JSON reader
  • ARROW-5037 - [Rust] [DataFusion] Refactor aggregate module
  • ARROW-5038 - [Rust][DataFusion] Implement AVG aggregate function
  • ARROW-5039 - [Rust][DataFusion] Re-implement CAST support
  • ARROW-5040 - [C++] ArrayFromJSON can't parse Timestamp from strings
  • ARROW-5045 - [Rust] Code coverage silently failing in CI
  • ARROW-5053 - [Rust][DataFusion] Use ARROW_TEST_DATA env var
  • ARROW-5054 - [Release][Flight] Test Flight in Linux/macOS release verification scripts
  • ARROW-5056 - [Packaging] Adjust conda recipes to use ORC conda-forge package on unix systems
  • ARROW-5061 - [Release] Improve 03-binary performance
  • ARROW-5062 - [Java][FlightRPC] Shade com.google.guava usage in Flight
  • ARROW-5063 - [FlightRPC][Java] Test that Flight client connections are independent
  • ARROW-5064 - [Release] Pass PKG_CONFIG_PATH to glib in the verification script
  • ARROW-5066 - [Integration] Add flags to enable/disable implementations in integration/integration_test.py
  • ARROW-5071 - [Archery] Implement running benchmark suite
  • ARROW-5076 - [Release] Improve post binary upload performance
  • ARROW-5077 - [Rust] Change Cargo.toml to use release versions
  • ARROW-5078 - [Documentation] Sphinx build fails with RemovedInSphinx30Warning
  • ARROW-5079 - [Release] Add a script that releases C# package
  • ARROW-5080 - [Release] Add a script that releases Rust packages
  • ARROW-5081 - [C++] Use PATH_SUFFIXES when searching for dependencies
  • ARROW-5083 - [Developer] PR merge script improvements: set already-released Fix Version, display warning when no components set
  • ARROW-5088 - [C++] Only add -Werror in debug builds. Add C++ documentation about compiler warning levels
  • ARROW-5091 - [Flight] Rename FlightGetInfo message to FlightInfo
  • ARROW-5093 - [Packaging] Add support for selective binary upload
  • ARROW-5094 - [Packaging] Add APT/Yum verification scripts
  • ARROW-5102 - [C++] Reduce header dependencies
  • ARROW-5108 - [Go] implement reading primitive arrays from Arrow file
  • ARROW-5109 - [Go] implement reading binary/string arrays from Arrow file
  • ARROW-5110 - [Go] implement reading struct arrays from Arrow file
  • ARROW-5111 - [Go] implement reading list arrays from Arrow file
  • ARROW-5112 - [Go] implement writing IPC Arrow stream/file
  • ARROW-5113 - [C++] Fix DoPut with dictionary arrays, add tests
  • ARROW-5115 - [JS] Add Vector Builders and high-level stream primitives
  • ARROW-5116 - [Rust] move kernel related files under compute/kernels
  • ARROW-5124 - [C++] Add support for Parquet in MinGW build
  • ARROW-5126 - [Rust][Parquet] Convert parquet column desc to arrow data type
  • ARROW-5127 - [Rust][Parquet] Add page iterator.
  • ARROW-5136 - [Flight] Call options
  • ARROW-5137 - [Flight] Implement auth API
  • ARROW-5145 - [C++] More input validation in release mode
  • ARROW-5150 - [Ruby] Add Arrow::Table#raw_records
  • ARROW-5155 - [GLib][Ruby] Add support for building union arrays from data type
  • ARROW-5157 - [Website] Add MATLAB to powered by Apache Arrow website
  • ARROW-5162 - [Rust][Parquet] Rename mod reader to arrow.
  • ARROW-5163 - [Gandiva] Cast timestamp/date are incorrectly evaluating year 0097 to 1997
  • ARROW-5164 - [Gandiva][C++] Introduce murmur32 for 32 bit types.
  • ARROW-5165 - [Python] update dev installation docs for --build-type + validate in setup.py
  • ARROW-5168 - [GLib] Add garrow_array_take()
  • ARROW-5171 - [C++] Use LESS instead of LOWER in compare enum
  • ARROW-5172 - [Go] implement reading fixed-size binary arrays from Arrow file
  • ARROW-5178 - [Python] Add Table.from_pydict()
  • ARROW-5179 - [Python] Return plain dicts, not OrderedDict, on Python 3.7+
  • ARROW-5185 - [C++] Add support for Boost with CMake configuration file
  • ARROW-5187 - [Rust] Add ability to convert StructArray to RecordBatch
  • ARROW-5188 - [Rust] Add temporal types to struct builders
  • ARROW-5189 - [Rust][Parquet] Format / display individual fields within a parquet row
  • ARROW-5190 - [R] Discussion: tibble dependency in R package
  • ARROW-5191 - [Rust] Expose CSV and JSON reader schemas
  • ARROW-5203 - [GLib] Add support for Compare filter
  • ARROW-5204 - [C++] Improve builder performance
  • ARROW-5212 - [Go] Support reserve for the data buffer in the BinaryBuilder
  • ARROW-5218 - [C++] Improve build when third-party library locations are specified
  • ARROW-5219 - [C++] Build protobuf_ep in parallel when using Ninja build
  • ARROW-5222 - [Python] Revise pyarrow installation instructions for macOS
  • ARROW-5225 - [Java] Improve performance of BaseValueVector#getValidityBufferSizeFromCount
  • ARROW-5226 - [Gandiva] Add cmp functions for decimals
  • ARROW-5238 - [Python] Convert arguments to pyarrow.dictionary
  • ARROW-5241 - [Python] expose option to disable writing statistics to parquet file
  • ARROW-5250 - [Java] Add javadoc comments to public methods, remove style check suppression.
  • ARROW-5252 - [C++] Use standard-compliant std::variant backport
  • ARROW-5256 - [C++] Add support for LLVM 7.1
  • ARROW-5257 - [Website] Update site to use "official" Apache Arrow logo, add clearly marked links to logo
  • ARROW-5258 - [C++/Python] Collect file metadata of dataset pieces
  • ARROW-5261 - [C++] Add missing scalar definitions for Intervals
  • ARROW-5262 - [Python] Fix typo
  • ARROW-5264 - [Java] Allow enabling/disabling boundary checking by environmental variable
  • ARROW-5266 - [Go] implement read/write IPC for Float16
  • ARROW-5268 - [GLib] Add GArrowJSONReader
  • ARROW-5269 - [C++][Archery] Mark relevant benchmarks as regression
  • ARROW-5275 - [C++] Generic filesystem tests
  • ARROW-5281 - [Rust] Extract DataPageBuilder to test common
  • ARROW-5284 - [Rust] Replace libc with std::alloc for memory allocation
  • ARROW-5286 - [Python] support struct type in from_pandas
  • ARROW-5288 - [Documentation] Enhance the contribution guidelines page
  • ARROW-5289 - [C++] Move arrow/util/concatenate* to arrow/array
  • ARROW-5290 - [Java] Provide a flag to enable/disable null-checking in vector's get methods
  • ARROW-5291 - [Python] Add wrapper for take kernel on Array
  • ARROW-5298 - [Rust] Add debug implementation for buffer data.
  • ARROW-5299 - [C++] ListArray comparison is incorrect
  • ARROW-5309 - [Python] clarify that Schema.append returns new object
  • ARROW-5311 - [C++] use more specific error status types in take
  • ARROW-5313 - [Format] Comments on Field table are a bit confusing
  • ARROW-5317 - [Rust][Parquet] impl IntoIterator for SerializedFileReader
  • ARROW-5319 - [C++][CI][travis skip]
  • ARROW-5321 - [Gandiva][C++] add isnull impl for string types
  • ARROW-5323 - [CI][skip travis]
  • ARROW-5328 - [R] Add shell scripts to do a full package rebuild and test locally
  • ARROW-5329 - [MATLAB] Add support for building MATLAB interface to Feather directly within MATLAB
  • ARROW-5334 - [C++] Ensure all type classes end with "Type"
  • ARROW-5335 - [Python] Raise exception on variable dictionaries in conversion to Python/pandas
  • ARROW-5339 - [C++] Add jemalloc URL to thirdparty/versions.txt so download_dependencies.sh gets it
  • ARROW-5341 - [C++][Documentation] developers/cpp.rst should mention documentation warnings
  • ARROW-5342 - [Format] Formalize "extension types" in Arrow protocol metadata
  • ARROW-5346 - [C++] Revert changes to vendored datetime library
  • ARROW-5349 - [C++][Parquet] Add method to set file path in a parquet::FileMetaData instance
  • ARROW-5361 - [R] Follow DictionaryType/DictionaryArray changes from ARROW-3144
  • ARROW-5363 - [GLib] Fix coding styles
  • ARROW-5364 - [C++] Use ASCII rather than UTF-8 in BuildUtils.cmake comment
  • ARROW-5365 - [C++][CI] Enable ASAN/UBSAN in CI
  • ARROW-5368 - [C++] Disable jemalloc by default with MinGW
  • ARROW-5369 - [C++] Add support for glog on Windows
  • ARROW-5370 - [C++] Use system uriparser if available
  • ARROW-5372 - [GLib] Add support for null/boolean values CSV read option
  • ARROW-5378 - [C++] Local filesystem implementation
  • ARROW-5384 - [Go] implement FixedSizeList array
  • ARROW-5389 - [C++] Add Temporary Directory facility
  • ARROW-5392 - [C++][CI] Disable static build with MinGW on AppVeyor
  • ARROW-5393 - [R] Add tests and example for read_parquet()
  • ARROW-5395 - [C++] Utilize stream EOS in File format
  • ARROW-5396 - [JS] Support files and streams with no record batches
  • ARROW-5401 - [CI][skip appveyor]
  • ARROW-5404 - [C++] force usage of nonstd::sv_lite::string_view instead of std::string_view
  • ARROW-5407 - [C++] Allow building only integration test targets
  • ARROW-5413 - [C++] Skip UTF8 BOM in CSV files
  • ARROW-5415 - [Release] Release script should update R version everywhere
  • ARROW-5416 - [Website] Add Homebrew to project installation page
  • ARROW-5418 - [CI][R] Run code coverage and report to codecov.io
  • ARROW-5420 - [Java] Implement or remove getCurrentSizeInBytes in Variab…
  • ARROW-5427 - [Python] pandas conversion preserve_index=True to force RangeIndex serialization
  • ARROW-5428 - [C++] Add option to set "read extent" in arrow::io::BufferedInputStream
  • ARROW-5429 - [Java] Provide alternative buffer allocation policy
  • ARROW-5432 - [Python] Add NativeFile.read_at()
  • ARROW-5433 - [C++][Parquet] Improve parquet-reader columns information, strip trailing whitespace from test case
  • ARROW-5434 - [Memory][Java] Introduce wrappers for backward compatibility.
  • ARROW-5436 - [Python] parquet.read_table add filters keyword
  • ARROW-5438 - [JS] EOS bytes for sequential readers
  • ARROW-5441 - [C++] Implement FindArrowFlight.cmake
  • ARROW-5442 - [Website] Clarify what makes a release artifact "official"
  • ARROW-5443 - [Crossbow] Turn parquet build off for Gandiva.
  • ARROW-5447 - [Ruby] Ensure flushing test gz file
  • ARROW-5449 - [C++] Test extended-length paths on Windows
  • ARROW-5451 - [C++][Gandiva] Support cast/round functions for decimal
  • ARROW-5452 - [R] Add API documentation website (pkgdown)
  • ARROW-5461 - [Java] Add micro-benchmarks for Float8Vector and allocators
  • ARROW-5463 - [Rust] Add AsRef trait for Buffer.
  • ARROW-5464 - [Archery] Fix default diff --benchmark-filter
  • ARROW-5465 - [Crossbow] Support writing submitted job definition yaml to a file
  • ARROW-5466 - [Java] Dockerize Java builds in Travis CI, run multiple JDKs in single entry
  • ARROW-5467 - [Go] implement read/write IPC for Time32/64 arrays
  • ARROW-5468 - [Go] implement read/write IPC for Timestamp arrays
  • ARROW-5469 - [Go] implement read/write IPC for Date32/64 arrays
  • ARROW-5470 - [CI] Fix Travis-CI R job that broke with the local fs patch
  • ARROW-5472 - [Development] Add warning to PR merge tool if no JIRA component is set
  • ARROW-5474 - [C++] Document Boost 1.58 as minimum supported version, add docker-compose entry for it, fix broken cpp/Dockerfile* builds
  • ARROW-5475 - [Python] Add Python binding for arrow::Concatenate
  • ARROW-5476 - [Java][Memory] Fix Netty Arrow Buf.
  • ARROW-5477 - [C++] Check required RapidJSON version
  • ARROW-5478 - [Packaging] Drop Ubuntu 14.04 support
  • ARROW-5481 - [GLib] Add "error" parameter document
  • ARROW-5485 - [C++] Install libraries from googletest_ep into build output directory on non-Windows platforms.
  • ARROW-5485 - [Crossbow] Disable unit tests in Gandiva macOS crossbow job until underlying issue resolved
  • ARROW-5486 - [GLib] Add binding of gandiva::FunctionRegistry and related things
  • ARROW-5488 - [R] Workaround when C++ lib not available
  • ARROW-5490 - [C++] Remove ARROW_BOOST_HEADER_ONLY
  • ARROW-5491 - [C++] Remove unnecessary semicolons following MACRO definitions
  • ARROW-5492 - [R] Add "col_select" argument to read_* functions to read subset of columns
  • ARROW-5495 - [C++] Update some dependency URLs from http to https
  • ARROW-5496 - [R][CI] Fix relative paths in R codecov.io reporting
  • ARROW-5498 - [C++][CI] Fix Flatbuffers related error with MinGW
  • ARROW-5499 - [R] Alternate bindings for when libarrow is not found
  • ARROW-5500 - [R] read_csv_arrow() signature should match readr::read_csv()
  • ARROW-5503 - [R] Add read_json()
  • ARROW-5504 - [R] Move use_threads argument to global option
  • ARROW-5509 - [R] Add basic write_parquet
  • ARROW-5511 - [Packaging] Enable Flight in Conda packages
  • ARROW-5512 - [C++] Rough API skeleton for C++ Datasets API / framework
  • ARROW-5513 - [Java] Refactor method name for getstartOffset to use camel case
  • ARROW-5516 - [Python][Documentation] Development page for pyarrow has a missing dependency in using pip
  • ARROW-5518 - [Java] Set VectorSchemaRoot rowCount to 0 on allocateNew and clear
  • ARROW-5524 - [C++] Turn off PARQUET_BUILD_ENCRYPTION in CMake if OpenSSL not found (#4494)
  • ARROW-5526 - [GitHub] Add more prominent notice to ISSUE_TEMPLATE.md to direct bug reports to JIRA
  • ARROW-5529 - [Flight] Allow serving with multiple TLS certificates
  • ARROW-5531 - [Python] Implement Array.from_buffers for varbinary and nested types, add DataType.num_buffers property
  • ARROW-5533 - [C++][Plasma] make plasma client thread safe
  • ARROW-5534 - [GLib] Add garrow_table_concatenate()
  • ARROW-5535 - [GLib] Add garrow_table_slice()
  • ARROW-5537 - [JS] Support delta dictionaries in RecordBatchWriter and DictionaryBuilder
  • ARROW-5538 - [C++] Restrict minimum OpenSSL version to 1.0.2
  • ARROW-5541 - [R] Casts from negative int32 to uint32 and uint64 are now safe
  • ARROW-5544 - [Archery] Don't return non-zero on regressions
  • ARROW-5545 - [C++][Docs] Clarify expectation of UTC values for timestamps with time zones
  • ARROW-5547 - [C++][FlightRPC] Support pkg-config for Arrow Flight
  • ARROW-5552 - [Go] make Schema, Field and simpleRecord implement Stringer
  • ARROW-5554 - [Python] Added a python wrapper for arrow::Concatenate()
  • ARROW-5555 - [R] Add install_arrow() function to assist the user in obtaining C++ runtime libraries
  • ARROW-5556 - [Doc][Python] Document JSON reader
  • ARROW-5557 - [C++] Add VisitBits benchmark
  • ARROW-5565 - [Python][Docs] Add instructions how to use gdb to debug C++ libraries when running Python unit tests
  • ARROW-5567 - [C++] Fix build error of memory-benchmark
  • ARROW-5571 - [R] Rework handling of ARROW_R_WITH_PARQUET
  • ARROW-5574 - [R] documentation error for read_arrow()
  • ARROW-5581 - [Java] Provide interfaces and initial implementations for vector sorting
  • ARROW-5582 - [Go] implement RecordEqual
  • ARROW-5586 - [R] convert Array of LIST type to R lists
  • ARROW-5587 - [Java] Add more style check rule for Java code
  • ARROW-5590 - [R] Run "no libarrow" R build in the same CI entry if possible
  • ARROW-5591 - [Go] implement read/write IPC for Duration & Intervals
  • ARROW-5597 - [Packaging] Add Flight deb packages
  • ARROW-5600 - [R] R package namespace cleanup
  • ARROW-5602 - [Java][Gandiva] Add tests for round/cast
  • ARROW-5604 - [Go] improve coverage of TypeTraits
  • ARROW-5609 - [C++] Set CMP0068 CMake policy to avoid macOS warnings
  • ARROW-5612 - [Python][Doc] Add prominent note that date_as_object option changed with Arrow 0.13
  • ARROW-5621 - [Go] implement read/write IPC for Decimal128 arrays
  • ARROW-5622 - [C++][Dataset] Support pkg-config for Arrow Datasets
  • ARROW-5625 - [R] convert Array of struct type to data frame columns
  • ARROW-5632 - [Doc] Basic instructions for using Xcode with Arrow
  • ARROW-5633 - [Python] Enable bz2 in Linux wheels
  • ARROW-5635 - [C++] Added a Compact() method to Table.
  • ARROW-5637 - [Java][C++][Gandiva] Complete In Expression Support
  • ARROW-5639 - [Java] Remove floating point computation from getOffsetBufferValueCapacity
  • ARROW-5641 - [GLib] Remove enums files generated by GNU Autotools from Git targets
  • ARROW-5643 - [FlightRPC] Add ability to override SSL hostname checking
  • ARROW-5650 - [Python] Update manylinux dependency versions
  • ARROW-5652 - [CI] Fix lint docker image
  • ARROW-5653 - [CI] Fix cpp docker image
  • ARROW-5656 - [Python][Packaging] Fix macOS wheel builds, add Flight support
  • ARROW-5659 - [C++] Add support for finding OpenSSL installed by Homebrew
  • ARROW-5660 - [GLib][CI] Use Xcode 10.2
  • ARROW-5661 - [Gandiva][C++] support hash functions for decimals in gandiva
  • ARROW-5662 - [C++] Add support for BOOST_SOURCE=AUTO|BUNDLED|SYSTEM
  • ARROW-5663 - [Packaging][RPM] Update CentOS packages for 0.14.0
  • ARROW-5664 - [Crossbow] Execute nightly crossbow tests on CircleCI instead of Travis
  • ARROW-5668 - [C++/Python] Include 'not null' in schema fields pretty print
  • ARROW-5669 - [Python][Packaging] Add ARROW_TEST_DATA env variable to Crossbow Linux Wheel build
  • ARROW-5670 - [Crossbow] get_apache_mirror.py fails with TLS error on macOS with Python 3.5
  • ARROW-5671 - [Crossbow] macOS Python wheels failing
  • ARROW-5672 - [Java] Refactor redundant method modifier
  • ARROW-5683 - [R] Add snappy to Rtools Windows builds
  • ARROW-5684 - [Packaging][deb] Add support for Ubuntu 19.04
  • ARROW-5685 - [Packaging][deb] Add support for Apache Arrow Datasets
  • ARROW-5687 - [C++] Remove remaining uses of ARROW_BOOST_VENDORED
  • ARROW-5690 - [Packaging][Python] Fix macOS wheel building
  • ARROW-5694 - [Python] Support list of Decimals in conversion to pandas
  • ARROW-5695 - [C#][Release] Run sourcelink test in verify-release-candidate.sh
  • ARROW-5696 - [C++][Gandiva] Introduce castVarcharVarchar
  • ARROW-5699 - [C++] Optimize decimal128 parsing
  • ARROW-5701 - [C++][Gandiva] Build expr with specific sv
  • ARROW-5702 - [C++] parquet::arrow::FileReader::GetSchema()
  • ARROW-5704 - [C++] Stop using ARROW_TEMPLATE_EXPORT for SparseTensorImpl
  • ARROW-5705 - [Java] Optimize BaseValueVector#computeCombinedBufferSize logic
  • ARROW-5706 - [Java] Remove type conversion in getValidityBufferValueCapacity
  • ARROW-5707 - [Java] Improve the performance and code structure for ArrowRecordBatch
  • ARROW-5710 - [C++] Allow compiling Gandiva with Ninja on Windows
  • ARROW-5715 - [Release] Verify Ubuntu 19.04 APT repository
  • ARROW-5718 - [R] auto splice data frames in record_batch() and table()
  • ARROW-5720 - [C++] Create benchmarks for decimal related classes.
  • ARROW-5721 - [Rust] Move array related code into a separate module
  • ARROW-5724 - [R][CI] AppVeyor build should use ccache
  • ARROW-5725 - [Crossbow] Port conda recipes to azure pipelines
  • ARROW-5726 - [Java] Implement a common interface for int vectors
  • ARROW-5727 - [Python][CI] Install pytest-faulthandler before running tests
  • ARROW-5748 - [Packaging][deb] Add support for Debian GNU/Linux buster
  • ARROW-5749 - [Python] Added python binding for Table::CombineChunks
  • ARROW-5751 - [Python][Packaging] Ensure that c-ares is linked statically in Python wheels
  • ARROW-5752 - [Java] Improve the performance of ArrowBuf#setZero
  • ARROW-5755 - [Rust][Parquet] Derive clone for Type.
  • ARROW-5768 - [Release] Remove needless empty lines at the end of CHANGELOG.md
  • ARROW-5773 - [R] Clean up documentation before release
  • ARROW-5780 - [C++] Add benchmark for Decimal operations
  • ARROW-5782 - [Release] Setup test data for Flight in dev/release/01-perform.sh
  • ARROW-5783 - [Release][C#] Exclude dummy.git from RAT check
  • ARROW-5785 - [Rust] Rust datafusion implementation should not depend on rustyline
  • ARROW-5787 - [Release][Rust] Use local modules to verify RC
  • ARROW-5793 - [Release] Avoid duplicate known host SSH error in dev/release/03-binary.sh
  • ARROW-5794 - [Release] Skip uploading already uploaded binaries
  • ARROW-5795 - [Release] Add missing waits on uploading binaries
  • ARROW-5796 - [Release][APT] Update expected package list
  • ARROW-5797 - [Release][APT] Update supported distributions
  • ARROW-5818 - [Java][Gandiva] support varlen output vectors
  • ARROW-5820 - [Release] Remove undefined variable check from verify script
  • ARROW-5826 - [Website] Blog post for 0.14.0 release announcement
  • PARQUET-1243 - [C++] Throw more informative exception when reading a length-0 Parquet file
  • PARQUET-1411 - [C++] Add parameterized logical annotations to Parquet metadata
  • PARQUET-1422 - [C++] Use common Arrow IO interfaces throughout codebase
  • PARQUET-1517 - [C++] Crypto package updates to match the final spec
  • PARQUET-1523 - [C++] Vectorize Comparator interface, remove virtual calls on inner loop. Refactor Statistics to not require PARQUET_EXTERN_TEMPLATE
  • PARQUET-1569 - [C++] Consolidate shared unit testing header files
  • PARQUET-1582 - [C++] Add ToString method to ColumnDescriptor
  • PARQUET-1583 - [C++] Remove superfluous parquet::Vector class
  • PARQUET-1586 - [C++] Add --dump options to parquet-reader tool to dump def/rep levels
  • PARQUET-1603 - [C++] rename parquet::LogicalType to parquet::ConvertedType

Bug Fixes

  • ARROW-61 - [Java] Method can return the value bigger than long MAX_VALUE
  • ARROW-352 - [Format] Interval(DAY_TIME) has no unit
  • ARROW-1837 - [Java][Integration] Fix unsigned round trip integration tests
  • ARROW-2119 - [IntegrationTest] Add test case with a stream having no record batches
  • ARROW-2136 - [Python] Check null counts for non-nullable fields when converting from pandas.DataFrame with supplied schema
  • ARROW-2256 - [C++] Fix libfuzzer builds for clang-7
  • ARROW-2461 - [Python] Build manylinux2010 wheels
  • ARROW-2590 - [Python] Pyspark python_udf serialization error on grouped map (Amazon EMR)
  • ARROW-3344 - [Python] Disable flaky Plasma test
  • ARROW-3399 - [Python] Implementing numpy matrix serialization
  • ARROW-3650 - [Python] warn on converting DataFrame with mixed type column names
  • ARROW-3801 - [Python] Pandas-Arrow roundtrip makes pd categorical index not writeable
  • ARROW-4021 - [Ruby] Error building red-arrow on msys2
  • ARROW-4076 - [Python] Validate ParquetDataset schema after filtering
  • ARROW-4139 - [Python][Parquet] Wrap new parquet::LogicalType, cast min/max statistics based on LogicalType
  • ARROW-4301 - [Java] use arrow-jni profile for both gandiva/orc
  • ARROW-4301 - [Java][Gandiva] Update version manually
  • ARROW-4324 - [Python] Triage broken type inference logic in presence of a mix of NumPy dtype-having objects and other scalar values
  • ARROW-4350 - [Python] Fix conversion from Python to Arrow with nested lists and NumPy dtype=object items
  • ARROW-4433 - [R] Segmentation fault when instantiating arrow::table from data frame
  • ARROW-4447 - [C++] Investigate dynamic linking for libthrift
  • ARROW-4516 - [Python] Error while creating a ParquetDataset on a path without `_common_dataset` but with an empty `_tempfile`
  • ARROW-4523 - [JS] Add row proxy generation benchmark
  • ARROW-4651 - [Flight] Use URIs instead of host/port pair
  • ARROW-4665 - [C++] With glog activated, DCHECK macros are redefined
  • ARROW-4675 - [Python] Fix pyarrow.deserialize failure when reading payload in Python 3 payload generated in Python 2
  • ARROW-4694 - [CI] Improve detect-changes.py on Travis PRs
  • ARROW-4723 - [Python] Ignore "hidden" files that start with underscore
  • ARROW-4725 - [C++] Enable dictionary builder tests with MinGW build
  • ARROW-4823 - [C++][Python] Do not close raw file handle in ReadaheadSpooler, check that file handles passed to read_csv are not closed
  • ARROW-4832 - [Python] pandas Index metadata for RangeIndex is incorrect
  • ARROW-4845 - [R] Compiler warnings on Windows MingW64
  • ARROW-4851 - [Java] BoundsChecking.java defaulting behavior for old drill parameter seems off
  • ARROW-4877 - [Plasma] CI failure in test_plasma_list
  • ARROW-4884 - [C++] conda-forge thrift-cpp package not available via pkg-config or cmake
  • ARROW-4885 - [C++/Python] Enable Decimal parsing in CSV
  • ARROW-4886 - [Rust] Cast to list with offset
  • ARROW-4923 - [Java] Add methods to set long value at given index in DecimalVector
  • ARROW-4934 - [Python] Address deprecation notice that will be a bug in Python 3.8
  • ARROW-5019 - [C#] ArrowStreamWriter doesn't work on a non-seekable stream
  • ARROW-5049 - [Python] org/apache/hadoop/fs/FileSystem class not found when pyarrow FileSystem used in spark
  • ARROW-5051 - [GLib][Gandiva] Don't return temporary memory
  • ARROW-5055 - [Ruby][MSYS2] libparquet needs to be installed in MSYS2 for ruby
  • ARROW-5058 - [Release] Fix typos in vote e-mail template
  • ARROW-5059 - [C++][Gandiva] cbrt_* floating point tests can fail due to exact comparisons
  • ARROW-5065 - [Rust] cast kernel does not support casting from Int64
  • ARROW-5068 - [Gandiva][Packaging] Fix gandiva nightly builds after the CMake refactor
  • ARROW-5090 - Parquet linking fails on MacOS due to @rpath in dylib
  • ARROW-5092 - [C#] Create a dummy .git directory to download the source files from GitHub with Source Link
  • ARROW-5095 - [Flight][C++] Expose server error message in DoGet
  • ARROW-5096 - [Packaging][deb] Add missing plasma-store-server packages
  • ARROW-5097 - [Packaging][CentOS6] Remove needless dependencies
  • ARROW-5098 - [Website] Update how to install .deb by APT
  • ARROW-5100 - [JS] Remove swap while collapsing contiguous buffers
  • ARROW-5117 - [Go] fix panic when nil or empty slices are appended to builders
  • ARROW-5119 - [Go] fix Boolean stringer implementation
  • ARROW-5122 - [Python] pyarrow.parquet.read_table raises non-file path error when given a windows path to a directory
  • ARROW-5128 - [Packaging][CentOS][Conda] Numpy not found in nightly builds
  • ARROW-5129 - [Rust] Column writer bug: check dictionary encoder when adding a new data page
  • ARROW-5130 - [C++][Python] Limit exporting of std::* symbols
  • ARROW-5132 - [Java] Errors on building gandiva_jni.dll on Windows with Visual Studio 2017
  • ARROW-5138 - [Python] Add documentation about pandas preserve_index option
  • ARROW-5140 - [Bug?][Parquet] Can write a jagged array column of strings to disk, but hit `ArrowNotImplementedError` on read
  • ARROW-5142, ARROW-5732, ARROW-5735 - [CI] Emergency fixes
  • ARROW-5144 - [Python] ParquetDataset and ParquetPiece not serializable
  • ARROW-5146 - [Dev] Fix project name inference in merge script
  • ARROW-5147 - [C++] Add missing dependencies to Brewfile
  • ARROW-5148 - [Gandiva] Allow linking with RTTI-disabled LLVM builds
  • ARROW-5149 - [Packaging][Wheel] Pin LLVM to version 7 in windows builds
  • ARROW-5152 - [Python] Fix CMake warnings
  • ARROW-5159 - [Rust] Unable to build benches in arrow crate.
  • ARROW-5160 - [C++] Don't evaluate expression twice in ABORT_NOT_OK
  • ARROW-5166 - [Python][Parquet] Statistics for uint64 columns may overflow
  • ARROW-5167 - [C++] Upgrade string-view-light to latest
  • ARROW-5169 - [Python] preserve field nullability of specified schema in Table.from_pandas
  • ARROW-5173 - [Go] handle multiple concatenated record batches
  • ARROW-5174 - [Go] implement Stringer for DataTypes
  • ARROW-5177 - [C++/Python] Check column index when reading Parquet column
  • ARROW-5183 - [CI] Fix AppVeyor failure
  • ARROW-5184 - [Rust] Broken links and other documentation warnings
  • ARROW-5186 - [Plasma] Fix crash caused by improper free on CUDA memory
  • ARROW-5194 - [C++][Plasma] TEST(PlasmaSerialization, GetReply) is failing
  • ARROW-5195 - [C++] Detect null strings in CSV string columns
  • ARROW-5201 - [Python] handle collections.abc deprecation warnings
  • ARROW-5208 - [Python] Add mask argument to pyarrow.infer_type, do not look at masked values when inferring output type in pyarrow.array
  • ARROW-5214 - [C++] Fix thirdparty download script
  • ARROW-5217 - [Rust][DataFusion] Fix failing tests
  • ARROW-5232 - [Java] Avoid runaway doubling of vector size
  • ARROW-5233 - [Go] Migrate to flatbuffers-v1.11.0
  • ARROW-5237 - [Python] populate _pandas_api.version
  • ARROW-5240 - [C++][CI] pin cmake_format
  • ARROW-5242 - [C++] Update vendored HowardHinnant/date to master
  • ARROW-5243 - [Java][Gandiva] Add decimal compare tests
  • ARROW-5245 - [CI][C++] Unpin cmake format (current version is 5.1)
  • ARROW-5246 - [Go] use Go-1.12.x in CI
  • ARROW-5249 - [Java] Add auth capability to Flight async operations (#4238)
  • ARROW-5253 - [C++] Fix snappy external build
  • ARROW-5254 - [Flight][Java] Change Flight doAction to allow multiple responses in Java
  • ARROW-5255 - [Java] Proof-of-concept of Java extension types
  • ARROW-5260 - [Python] Fix crash when deserializing from components in another process
  • ARROW-5274 - [JavaScript] Wrong array type for countBy
  • ARROW-5283 - [C++][Plasma] Erase object id in client when abort object
  • ARROW-5285 - [C++][Plasma] Implement to release GpuProcessHandle
  • ARROW-5293 - [C++] Take kernel on DictionaryArray does not preserve ordered flag
  • ARROW-5294 - [Python][CI] Fix manylinux1 build
  • ARROW-5296 - [Java] Ignore timeout-based Flight tests for now
  • ARROW-5301 - [Python] update parquet docs on multithreading
  • ARROW-5304 - [C++] Fix thread safety of CudaDeviceManager::GetInstance
  • ARROW-5306 - [CI][GLib] Disable GTK-Doc
  • ARROW-5308 - [Go] remove deprecated Feather format
  • ARROW-5314 - [Go] fix bug for String Arrays with offset
  • ARROW-5314 - [Go] Fix bug for FixedSizeBinary with offset
  • ARROW-5318 - [Python] pyarrow hdfs reader overrequests
  • ARROW-5325 - [Archery][Benchmark] Output properly formatted jsonlines from benchmark diff cli command
  • ARROW-5330 - [CI][skip appveyor]
  • ARROW-5332 - [R] Update R package README with richer installation instructions
  • ARROW-5348 - [Java][CI] Add missing gandiva javadoc
  • ARROW-5360 - [Rust] Update rustyline to fix build
  • ARROW-5362 - [C++] Fix compression test memory usage
  • ARROW-5371 - [Release] Add tests for dev/release/00-prepare.sh
  • ARROW-5373 - [Java] Add missing details for Gandiva Java Build
  • ARROW-5376 - [C++] Workaround for gcc 5.4.0 bug
  • ARROW-5383 - [Go] Update flatbuf for new Duration type
  • ARROW-5387 - [Go] properly handle sub-slice of List
  • ARROW-5388 - [Go] use arrow.TypeEquals in array.NewChunked
  • ARROW-5390 - [CI][skip appveyor]
  • ARROW-5397 - [FlightRPC] Add TLS certificates for testing Flight
  • ARROW-5398 - [Python] Fix Flight tests
  • ARROW-5403 - [C++] Use GTest shared libraries with BUNDLED build, always use BUNDLED with MSVC
  • ARROW-5411 - [C++][Python] Build error building on Mac OS Mojave
  • ARROW-5412 - [Integration] Add Java option for netty reflection
  • ARROW-5419 - [C++] Allow recognizing empty strings as null strings in CSV files
  • ARROW-5421 - [Packaging][Crossbow] Duplicated key in nightly test configuration
  • ARROW-5422 - [CI] [C++] Build failure with Google Benchmark
  • ARROW-5430 - [Python] Raise ArrowInvalid for pyints larger than int64
  • ARROW-5435 - [Java] Add test for IntervalYearVector#getAsStringBuilder
  • ARROW-5437 - [Python] Missing pandas pytest marker from parquet tests
  • ARROW-5446 - [C++][CMake] Install arrow/util/config.h into CMAKE_INSTALL_INCLUDEDIR
  • ARROW-5448 - [C++][CI][MinGW][skip travis]
  • ARROW-5453 - [C++] Update to cmake-format=0.5.2 and pin again
  • ARROW-5455 - [Rust] Build broken by 2019-05-30 Rust nightly
  • ARROW-5456 - [GLib][Plasma] Fix dependency order on building document
  • ARROW-5457 - [GLib][Plasma] Fix environment variable name for test
  • ARROW-5459 - [Go] implement Stringer for float16 DataType
  • ARROW-5462 - [Go] support writing zero-length List arrays
  • ARROW-5479 - [Rust][DataFusion] Use ARROW_TEST_DATA instead of relative path for testing
  • ARROW-5487 - [Docs] Fix Sphinx failure
  • ARROW-5493 - [Go][Integration] add Go support for IPC integration tests
  • ARROW-5507 - [Plasma][CUDA] Fix compile error
  • ARROW-5514 - [C++] Fix pretty-printing uint64 values
  • ARROW-5517 - [C++] Only check header basename for 'internal' when collecting public headers
  • ARROW-5520 - [Packaging][deb] Add support for building on arm64
  • ARROW-5521 - [Packaging] Use Apache RAT 0.13
  • ARROW-5528 - [C++] Fixed a bug when Concatenate() arrays with no value buffers.
  • ARROW-5532 - [JS] Field Metadata Not Read
  • ARROW-5551 - [Go] implement FixedSizeArrays with 2-buffers layout
  • ARROW-5553 - [Ruby] Use the official packages to install Apache Arrow
  • ARROW-5576 - [C++] Query ASF mirror system for URL and use when downloading Thrift
  • ARROW-5577 - [C++][Alpine] Correct googletest shared library paths on non-Windows to fix Alpine build
  • ARROW-5583 - [Java] When the isSet of a NullableValueHolder is 0, the buffer field should not be used
  • ARROW-5584 - [Java] Add import for link reference in FieldReader javadoc
  • ARROW-5589 - [C++] Add missing nullptr check during flatbuffer decoding
  • ARROW-5592 - [Go] implement Duration array
  • ARROW-5596 - [Python] Fix Python-3 syntax only in test_flight.py
  • ARROW-5601 - [C++][Gandiva] fail if the output type is not supported
  • ARROW-5603 - [Python] Register custom pytest markers to avoid warnings
  • ARROW-5605 - [C++] Verify Flatbuffer messages in more places to prevent crashes due to bad inputs
  • ARROW-5606 - [Python] deal with deprecated RangeIndex._start/_stop/_step
  • ARROW-5608 - [C++][parquet] Fix invalid memory access when using parquet::arrow::ColumnReader
  • ARROW-5615 - [C++] gcc 5.4.0 doesn't want to parse inline C++11 string R literal
  • ARROW-5616 - [C++][Python] Fix -Wwrite-strings warning when building against Python 2.7 headers
  • ARROW-5617 - [C++] thrift_ep 0.12.0 fails to build when using ARROW_BOOST_VENDORED=ON
  • ARROW-5619 - [C++] Make get_apache_mirror.py workable with Python 3.5
  • ARROW-5623 - [GLib][CI] Use system Meson on macOS
  • ARROW-5624 - [C++] Fix typo causing build failure when -Duriparser_SOURCE=BUNDLED
  • ARROW-5626 - [C++] Fix caching of expressions with decimals
  • ARROW-5629 - [C++] Fix Coverity issues
  • ARROW-5631 - [C++] Fix FindBoost targets with cmake3.2
  • ARROW-5644 - [Python] test_flight.py::test_tls_do_get appears to hang
  • ARROW-5647 - [Python] Accessing a file from Databricks using pandas read_parquet using the pyarrow engine fails with: Passed non-file path: /mnt/aa/example.parquet
  • ARROW-5648 - [C++] Avoid using codecvt
  • ARROW-5654 - [C++][Python] Add ChunkedArray::Validate method that checks chunk types for consistency, invoke in Python
  • ARROW-5657 - [C++] "docker-compose run cpp" broken in master
  • ARROW-5674 - [Python] Missing pandas pytest markers from test_parquet.py
  • ARROW-5675 - [Doc] Fix typo in Xcode workflow documentation
  • ARROW-5678 - [R][Lint] Fix hadolint docker linting error
  • ARROW-5693 - [Go] skip IPC integration tests for Decimal128
  • ARROW-5697 - [GLib] Use system pkg-config in c_glib/Dockerfile to correctly find system libraries such as libglib
  • ARROW-5698 - [R] Fix docker-compose build
  • ARROW-5709 - [C++] Fix gandiva-date_time_test failure on Windows
  • ARROW-5714 - [JS] Inconsistent behavior in Int64Builder with/without BigNum
  • ARROW-5723 - [C++][Arrow] Fix crossbow failure
  • ARROW-5728 - [Python] Pin jpype1 version to 0.6.3 due to CI breakage from 0.7.0
  • ARROW-5729 - [Python][Java] ArrowType.Int object has no attribute 'isSigned'
  • ARROW-5730 - [Python][CI] Selectively skip test cases in the dask integration test
  • ARROW-5732 - [C++] macOS builds failing idiosyncratically on master with warnings from pmmintrin.h
  • ARROW-5735 - [C++] Appveyor builds failing persistently in thrift_ep build
  • ARROW-5737 - [Crossbow] Use Python version 2.7 in the gandiva tasks
  • ARROW-5738 - [Crossbow][Conda] OSX package builds are failing with missing intrinsics
  • ARROW-5739 - [CI] Fix python docker image
  • ARROW-5750 - [Java] Fix java compilation errors
  • ARROW-5754 - [C++] Add override mark for ~GrpcStreamWriter
  • ARROW-5765 - [C++] Fix TestDictionary.Validate in release mode, add docker-compose job for testing C++ release build
  • ARROW-5769 - [Release] Ensure setting up test data in dev/release/00-prepare.sh
  • ARROW-5770 - [C++] Fix -Wpessimizing-move in result.h
  • ARROW-5771 - [Python] Add pytz to conda_env_python.yml to fix python-nopandas build
  • ARROW-5774 - [Java][Documentation] Document the need to checkout git submodules for flight
  • ARROW-5781 - [Archery] Ensure benchmark clone accepts remote in revision
  • ARROW-5791 - [Python] pyarrow.csv.read_csv hangs + eats all RAM
  • ARROW-5816 - [Release] Parallel curl does not work reliably in verify-release-candidate.sh
  • ARROW-5922 - [Python] Unable to connect to HDFS from a worker/data node on a Kerberized cluster using pyarrow's hdfs API
  • PARQUET-1402 - [C++] Parquet files with dictionary page offset as 0 is not readable
  • PARQUET-1405 - Fix writing statistics into DataPageHeader
  • PARQUET-1565 - [C++] Add default case to catch all unhandled physical types
  • PARQUET-1571 - [C++] Fix BufferedInputStream when buffer exactly exhausted
  • PARQUET-1574 - [C++] fix parquet-encoding-test
  • PARQUET-1581 - [C++] Fix undefined behavior in encoding.cc
kou published 0.13.0

Changelog

Source

Apache Arrow 0.13.0 (2019-04-01)

Bug Fixes

  • ARROW-295 - [Documentation] Add DOAP file
  • ARROW-1171 - [C++] Segmentation faults on Fedora 24 with pyarrow-manylinux1 and self-compiled turbodbc
  • ARROW-2392 - [C++] Check schema compatibility when writing a RecordBatch
  • ARROW-2399 - [Rust] Builder<T> should not provide a set() method
  • ARROW-2598 - [Python] table.to_pandas segfault
  • ARROW-3086 - [GLib] GISCAN fails due to conda-shipped openblas
  • ARROW-3096 - [Python] Update Python source build instructions given Anaconda/conda-forge toolchain migration
  • ARROW-3133 - [C++] Remove allocation from Binary Boolean Kernels.
  • ARROW-3133 - [C++] Remove allocations from InvertKernel
  • ARROW-3208 - [C++] Fix Cast dictionary to numeric segfault
  • ARROW-3426 - [CI] Java integration test very verbose
  • ARROW-3564 - [C++] Fix dictionary encoding logic for Parquet 2.0
  • ARROW-3578 - [Release] Resolve all hard and symbolic links in tar.gz
  • ARROW-3593 - [R] CI builds failing due to GitHub API rate limits
  • ARROW-3606 - [Crossbow] Fix flake8 crossbow warnings
  • ARROW-3669 - [Python] Raise error on Numpy byte-swapped array
  • ARROW-3843 - [C++][Python] Allow a "degenerate" Parquet file with no columns
  • ARROW-3923 - [Java] JDBC Time Fetches Without Timezone
  • ARROW-4007 - [Java][Plasma] Plasma JNI tests failing
  • ARROW-4050 - [Python][Parquet] core dump on reading parquet file
  • ARROW-4081 - [Go] Sum methods panic when the array is empty
  • ARROW-4104 - [Java] race in AllocationManager during release
  • ARROW-4108 - [Python/Java] Spark integration tests do not work
  • ARROW-4117 - [Python] "asv dev" command fails with latest revision
  • ARROW-4140 - [C++][Gandiva] Compiled LLVM bitcode file path may result in libraries being non-relocatable
  • ARROW-4145 - [C++] Find Windows-compatible strptime implementation
  • ARROW-4181 - [Python] Fixes for Numpy struct array conversion
  • ARROW-4192 - [CI] Fix broken dev/run_docker_compose.sh script
  • ARROW-4213 - [Flight] Fix incompatibilities between C++ and Java
  • ARROW-4244 - [Format] Clarify padding/alignment rationale/recommendation.
  • ARROW-4250 - [C++] adding explicit epsilon for ApproxEquals and corresponding assert macro
  • ARROW-4252 - [C++] Fix missing Status code and newline
  • ARROW-4253 - [GLib] Cannot use non-system Boost specified with $BOOST_ROOT
  • ARROW-4254 - [C++][Gandiva] Build with Boost from Ubuntu Trusty apt
  • ARROW-4255 - [C++] Eagerly initialize name_to_index_ to avoid race
  • ARROW-4261 - [C++] Make CMake paths for IPC, Flight, Thrift, and Plasma subproject compatible
  • ARROW-4264 - [C++] Clarify use of DCHECKs in Kernels
  • ARROW-4267 - [C++/Parquet] Handle duplicate and struct columns in RowGroup reads
  • ARROW-4274 - [C++][Gandiva] split decimal into two parts
  • ARROW-4275 - [C++][Gandiva] Fix slow decimal test
  • ARROW-4280 - Update README.md to reflect parquet deps
  • ARROW-4282 - [Rust] builder benchmark is broken
  • ARROW-4284 - [C#] File / Stream serialization fails due to type mismatch / missing footer
  • ARROW-4295 - [C++][Plasma] Fix incorrect log message
  • ARROW-4296 - [Plasma] Use one mmap file by default, prevent crash with -f
  • ARROW-4308 - [Python] pyarrow has a hard dependency on pandas
  • ARROW-4311 - [Python] Regression on pq.ParquetWriter incorrectly handling source string
  • ARROW-4312 - [C++] Only run 2 * os.cpu_count() clang-format instances at once
  • ARROW-4319 - [C++][Plasma] plasma/store.h pulls in flatbuffer dependency
  • ARROW-4320 - [C++] Add tests for non-contiguous tensors
  • ARROW-4322 - [C++] Don't use _GLIBCXX_USE_CXX11_ABI=0 anymore in docker scripts
  • ARROW-4323 - [Packaging] Fix failing OSX clang conda forge builds
  • ARROW-4326 - [C++] Development instructions in python/development.rst will not work for many Linux distros with new conda-forge toolchain
  • ARROW-4327 - [Python] Add requirements-build.txt convenience file
  • ARROW-4328 - Add a ARROW_USE_OLD_CXXABI configure var to R
  • ARROW-4329 - Python should include the parquet headers
  • ARROW-4342 - [Gandiva][Java] Ignore flaky test.
  • ARROW-4347 - [CI][Python] Also run Python builds when Java affected.
  • ARROW-4349 - [C++] Add static linking option for benchmarks, fix Windows benchmark build failures
  • ARROW-4351 - [C++] Fix CMake errors when neither building shared libraries nor tests
  • ARROW-4355 - [C++] Reorder testing code into src/arrow/testing
  • ARROW-4360 - [C++] Query homebrew for Thrift
  • ARROW-4364 - [C++] Fix CHECKIN warnings
  • ARROW-4366 - [Docs] Change extension from format/README.md to format/README.rst
  • ARROW-4367 - [C++] StringDictionaryBuilder segfaults on Finish with only null entries
  • ARROW-4368 - [Docs] Fix install document for Ubuntu 16.04 or earlier
  • ARROW-4370 - [Python][Bool] to pandas
  • ARROW-4374 - [C++] DictionaryBuilder does not correctly report length and null_count
  • ARROW-4381 - [CI] Update linter container build instructions
  • ARROW-4382 - [C++] Improve new cpplint output readability
  • ARROW-4384 - [C++] Running "format" target on new Windows 10 install opens "how do you want to open this file" dialog
  • ARROW-4385 - [Packaging] Fix PyArrow version update pattern on release
  • ARROW-4389 - [R] Don't install clang-tools in test job
  • ARROW-4395 - [JS] Fix ts-node error running bin/arrow2csv
  • ARROW-4400 - [CI] Switch to https repo for llvm
  • ARROW-4403 - [Rust] Fix format errors
  • ARROW-4404 - [CI] AppVeyor toolchain build does not build anything
  • ARROW-4407 - [C++] Cache compiler for CMake external projects
  • ARROW-4410 - [C++] Fix edge cases in InvertKernel
  • ARROW-4413 - [Python] Fix pa.hdfs.connect() on Python 2
  • ARROW-4414 - [C++] Stop using cmake COMMAND_EXPAND_LISTS because it breaks package builds for older distros
  • ARROW-4417 - [C++] Fix doxygen build
  • ARROW-4420 - [INTEGRATION] Make spark integration test pass and test against spark's master branch
  • ARROW-4421 - [C++][Flight] Handle large RPC messages in Flight
  • ARROW-4434 - [Python] Allow creating trivial StructArray
  • ARROW-4440 - [C++] Revert recent changes to flatbuffers EP causing flakiness
  • ARROW-4457 - [Python] Allow creating Decimal array from Python ints
  • ARROW-4469 - [CI] Pin conda-forge binutils version to 2.31 for now
  • ARROW-4471 - [C++] Pass AR and RANLIB to all external projects
  • ARROW-4474 - Use signed integers in FlightInfo payload size fields
  • ARROW-4480 - [Python] Drive letter removed when writing parquet file
  • ARROW-4487 - [C++] Appveyor toolchain build does not actually build the project
  • ARROW-4494 - [Java] arrow-jdbc JAR is not uploaded on release
  • ARROW-4496 - [Python] Pin to gfortran<4
  • ARROW-4498 - [Plasma] Fix building Plasma with CUDA enabled
  • ARROW-4500 - [C++] Remove pthread / librt hacks causing linking issues in some Linux environments
  • ARROW-4501 - Fix out-of-bounds read in DoubleCrcHash
  • ARROW-4525 - [Rust][Parquet] Enable conversion of ArrowError to ParquetError
  • ARROW-4527 - [Packaging][Linux] Use LLVM 7
  • ARROW-4532 - [Java] fix bug causing very large varchar value buffers
  • ARROW-4533 - [Python] Document how to run hypothesis tests
  • ARROW-4535 - [C++] Fix MakeBuilder to preserve ListType's field name
  • ARROW-4536 - [GLib] Add data_type argument in garrow_list_array_new
  • ARROW-4538 - [Python] Remove index column from subschema in write_to_dataframe
  • ARROW-4549 - [C++] Can't build benchmark code on CUDA enabled build
  • ARROW-4550 - [JS] Fix AMD pattern
  • ARROW-4559 - [Python] Allow Parquet files with special characters in their names
  • ARROW-4563 - [Python] Validate decimal128() precision input
  • ARROW-4571 - [Format] Tensor.fbs file has multiple root_type declarations
  • ARROW-4573 - [Python] Add Flight unit tests
  • ARROW-4576 - [Python] Fix error during benchmarks
  • ARROW-4577 - [C++] Don't set interface link libs on arrow_shared where there are none
  • ARROW-4581 - [C++] Do not require googletest_ep or gbenchmark_ep for library targets
  • ARROW-4582 - [Python/C++] Acquire the GIL on Py_INCREF
  • ARROW-4584 - [Python] Add built wheel to manylinux1 dockerignore
  • ARROW-4585 - [C++] Add protoc dependency to flight_testing
  • ARROW-4587 - [C++] Fix segfaults around DoPut implementation
  • ARROW-4597 - [C++] Targets for system Google Mock shared library are missing
  • ARROW-4601 - [Python] Add license header to dockerignore
  • ARROW-4606 - [Rust] [DataFusion] FilterRelation created RecordBatch with empty schema
  • ARROW-4608 - [C++] cmake script assumes that double-conversion installs static libs
  • ARROW-4617 - [C++] Support double-conversion<3.1
  • ARROW-4624 - [C++] Fix building benchmarks
  • ARROW-4629 - [Python] Pandas arrow conversion slowed down by imports
  • ARROW-4635 - [Java] allocateNew to use last capacity
  • ARROW-4639 - [CI] Switch off GFLAGS_SHARED for osx
  • ARROW-4641 - [C++][Flight] Suppress strict aliasing warnings from "unsafe" casts in client.cc
  • ARROW-4642 - [R] change f to file in read_parquet_file()
  • ARROW-4653 - [C++] Fix bug in decimal multiply
  • ARROW-4654 - [C++] Explicit flight.cc source dependencies
  • ARROW-4657 - Don't build benchmarks in release verify script
  • ARROW-4658 - [C++] Shared gflags is also a run-time conda requirement
  • ARROW-4659 - [CI] ubuntu/debian nightlies fail because of missing gandiva files
  • ARROW-4660 - [C++] Use set_target_properties for defining GFLAGS_IS_A_DLL
  • ARROW-4664 - [C++] Do not execute expressions inside DCHECK macros in release builds
  • ARROW-4669 - [Java] Add validity checks to slice
  • ARROW-4672 - [CI] Fix clang-7 build entry
  • ARROW-4680 - [CI][Rust] Travis CI builds fail with latest Rust 1.34.0…
  • ARROW-4684 - [Python] CI failures in test_cython.py
  • ARROW-4687 - [Python] Stop Flight server on incoming signals
  • ARROW-4688 - [C++][Parquet] Chunk binary column reads at 2^31 - 1 byte boundaries to avoid splitting chunk inside nested string cell
  • ARROW-4696 - Better CUDA detection in release verification script
  • ARROW-4699 - [C++] remove json chunker's requirement of null terminated buffers
  • ARROW-4704 - [GLib][CI] Ensure killing plasma_store_server
  • ARROW-4710 - [C++][R] New linting script skips files with "cpp" extension
  • ARROW-4712 - [C++][CI] fix build (sum.cc) has warnings in clang
  • ARROW-4721 - [Rust][DataFusion] Propagate schema in filter
  • ARROW-4724 - [C++][CI] Enable Python build and test in MinGW build
  • ARROW-4728 - [JS] Fix Table#assign when passed zero-length RecordBatches
  • ARROW-4737 - run C# tests in CI
  • ARROW-4744 - [C++][CI] Change mingw builds back to debug. Clean up some version warnings
  • ARROW-4750 - [C++] RapidJSON triggers Wclass-memaccess on GCC 8+
  • ARROW-4760 - [C++] protobuf 3.7 defines EXPECT_OK that clashes with Arrow's macro
  • ARROW-4766 - [C++] Fix empty array cast segfault
  • ARROW-4767 - [C#] ArrowStreamReader crashes while reading the end of a stream
  • ARROW-4768 - [C++][CI] Don't run flaky tests in MinGW build
  • ARROW-4774 - [C++] Fix FileWriter::WriteTable segfault
  • ARROW-4775 - [Site] Site navbar cannot be expanded
  • ARROW-4783 - [C++][CI] Disable arrow thread-pool test on mingw to avoid appveyor timeouts
  • ARROW-4793 - [Ruby] Suppress unused variable warning
  • ARROW-4796 - [Flight/Python] Keep underlying Python object alive in FlightServerBase.do_get
  • ARROW-4802 - [Python] Follow symlinks when deriving Hadoop classpath for HDFS
  • ARROW-4807 - [Rust] Fix csv_writer benchmark
  • ARROW-4811 - [C++] Fix misbehaving CMake dependency on flight_grpc_gen
  • ARROW-4813 - [Ruby] Add tests for == and !=
  • ARROW-4820 - [Python] hadoop class path derived not correct
  • ARROW-4822 - [C++/Python] Check for None on calls to equals
  • ARROW-4828 - [Python] manylinux1 docker-compose context should be python/manylinux1
  • ARROW-4850 - [CI] Ensure integration_test.py returns non-zero on failures
  • ARROW-4853 - [Rust] Array slice doesn't work on ListArray and StructArray
  • ARROW-4857 - [C++/Python/CI] docker-compose in manylinux1 crossbow jobs too old
  • ARROW-4866 - [C++] Fix zstd_ep build for Debug, static CRT builds. Add separate CMake variable for propagating compiler toolchain to ExternalProjects
  • ARROW-4867 - [Python] Respect ordering of columns argument passed to Table.from_pandas
  • ARROW-4869 - [C++] Fix gmock usage in compute/kernels/util-internal-test.cc
  • ARROW-4870 - [Ruby] Fix msys2_mingw_dependencies
  • ARROW-4871 - [Java/Flight] Handle large Flight messages
  • ARROW-4872 - [Python] Keep backward compatibility for ParquetDatasetPiece
  • ARROW-4879 - [C++] cmake can't use conda's flatbuffers
  • ARROW-4881 - [C++] remove references to ARROW_BUILD_TOOLCHAIN
  • ARROW-4900 - [C++] polyfill __cpuidex on mingw-w64
  • ARROW-4903 - [C++] Fix static/shared-only builds
  • ARROW-4906 - [Format] Document that the SparseMatrixIndexCSR format is sorted
  • ARROW-4918 - [C++] Add cmake-format to pre-commit
  • ARROW-4928 - [Python] Fix Hypothesis test failures
  • ARROW-4931 - [C++] CMake fails on gRPC ExternalProject
  • ARROW-4938 - [GLib] Undefined symbols error occurred when GIR file is being generated.
  • ARROW-4942 - [Ruby] Remove needless omits in tests
  • ARROW-4948 - [JS] Nightly test failure
  • ARROW-4950 - [C++] Fix CMake 3.2 build
  • ARROW-4952 - [C++] Floating-point comparisons should consider NaNs unequal
  • ARROW-4953 - [Ruby] Not loading libarrow-glib
  • ARROW-4954 - [Python] Fix test failure with Flight enabled
  • ARROW-4958 - [C++] Parquet benchmarks depend on its static test libs
  • ARROW-4961 - [C++] Add documentation note that GTest_SOURCE=BUNDLED is currently required on Windows
  • ARROW-4962 - [C++] Warning level to CHECKIN can't compile on modern GCC
  • ARROW-4976 - [JS] Invalidate RecordBatchReader node/dom streams on reset()
  • ARROW-4982 - [GLib][CI] Run tests on AppVeyor
  • ARROW-4984 - Check if Flight gRPC server starts properly
  • ARROW-4986 - [CI] Travis fails to install llvm@7
  • ARROW-4989 - [C++] Find re2 on Ubuntu if asked to
  • ARROW-4991 - [CI] Bump travis node version to 11.12
  • ARROW-4997 - [C#] ArrowStreamReader doesn't consume whole stream and doesn't implement sync read.
  • ARROW-5009 - [C++] Remove using std::.* where I could find them
  • ARROW-5010 - [Release] Fix source release docker
  • ARROW-5012 - [C++] Install testing headers
  • ARROW-5023 - [Release] Fix default value syntax in 02-source.sh
  • ARROW-5024 - [Release] Fix missing variable with --arrow-version
  • ARROW-5025 - [Python][Packaging] Fix gandiva.dll detection
  • ARROW-5026 - [Python][Packaging] Fix gandiva.dll detection on non Windows
  • ARROW-5029 - [C++] Fix compilation warnings in release mode
  • ARROW-5031 - [Dev] Run CUDA Python tests in release verification script
  • ARROW-5042 - [Release] Use the correct dependency source in verification script
  • ARROW-5043 - [Release][Ruby] Fix dependency error in verification script
  • ARROW-5044 - [Release][Rust] Use stable toolchain for format check in verification script
  • ARROW-5046 - [Release][C++] Exclude fragile Plasma test from verification script
  • ARROW-5047 - [Release] Always set up parquet-testing in verification script
  • ARROW-5048 - [Release][Rust] Set up arrow-testing in verification script
  • ARROW-5050 - [C++] cares_ep should build before grpc_ep
  • ARROW-5087 - [Debian] APT repository no longer contains libarrow-dev
  • ARROW-5658 - [Java] Provide ability to resync VectorSchemaRoot if types change
  • PARQUET-1482 - [C++] Add branch to TypedRecordReader::ReadNewPage for …
  • PARQUET-1494 - [C++] Recognize statistics built with UNSIGNED sort order by parquet-mr 1.10.0 onwards
  • PARQUET-1532 - [C++] Fix build error with MinGW
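
ARROW-4952 in the list above changes floating-point comparisons so that NaNs are considered unequal. The following is a minimal TypeScript sketch of that comparison rule, not Arrow's C++ implementation; the helper name `floatArraysEqual` is hypothetical and does not exist in any Arrow API.

```typescript
// Hypothetical helper (not an Arrow API): element-wise equality of two
// float arrays in which NaN never equals anything, including itself.
function floatArraysEqual(a: Float64Array, b: Float64Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    // `!==` follows IEEE 754: NaN !== NaN is true, so a NaN in either
    // array makes the two arrays compare unequal.
    if (a[i] !== b[i]) return false;
  }
  return true;
}

const plain = Float64Array.of(1.0, 2.5);
const withNaN = Float64Array.of(1.0, NaN);

console.log(floatArraysEqual(plain, plain));     // true
console.log(floatArraysEqual(withNaN, withNaN)); // false
```

Under this rule an array containing NaN is not equal even to itself, which matches the usual IEEE 754 semantics rather than a bitwise comparison.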

New Features and Improvements

  • ARROW-47 - [C++] Preliminary arrow::Scalar object model
  • ARROW-331 - [Doc] Add statement about Python 2.7 compatibility
  • ARROW-549 - [C++] Add arrow::Concatenate function to combine multiple arrays into a single Array
  • ARROW-572 - [C++] Apply visitor pattern in IPC metadata
  • ARROW-585 - [C++] Experimental public API for user-defined extension types and arrays
  • ARROW-694 - [C++] Initial parser interface for reading JSON into RecordBatches
  • ARROW-1425 - [Python][Documentation] Examples of converting Timestamps to/from pandas via Arrow
  • ARROW-1572 - [C++] Implement "value counts" kernels for tabulating value frequencies
  • ARROW-1639 - [Python] Serialize RangeIndex as metadata via Table.from_pandas instead of converting to a column of integers
  • ARROW-1642 - [GLib] Build GLib using Meson in Appveyor
  • ARROW-1807 - [Java] Reduce Heap Usage (Phase 3): consolidate buffers
  • ARROW-1896 - [C++] Do not allocate memory inside CastKernel. Clean up template instantiation to not generate dead identity cast code
  • ARROW-2015 - [Java] Replace Joda time with Java 8 time
  • ARROW-2022 - [Format] Add metadata to message
  • ARROW-2112 - [C++] Enable cpplint to be run on Windows
  • ARROW-2243 - [C++] Enable IPO/LTO
  • ARROW-2409 - [Rust] Deny warnings in CI.
  • ARROW-2460 - [Rust] Schema and DataType::Struct should use Vec<Rc<Field>>
  • ARROW-2487 - [C++] Provide a variant of AppendValues that takes bytemaps for the nullability
  • ARROW-2523 - [Rust] Implement CAST operations for arrays
  • ARROW-2620 - [Rust] Integrate memory pool abstraction with rest of codebase
  • ARROW-2627 - [Python] Add option to pass memory_map argument to ParquetDataset
  • ARROW-2904 - [C++] Use FirstTimeBitmapWriter instead of SetBit functions in builder.h/cc
  • ARROW-3066 - [Wiki] Add "How to contribute" to developer wiki
  • ARROW-3084 - [Python] Do we need to build both unicode variants of pyarrow wheels?
  • ARROW-3107 - [C++] arrow::PrettyPrint for Column instances
  • ARROW-3121 - [C++] Mean aggregate kernel
  • ARROW-3123 - [C++] Implement Count aggregate kernel
  • ARROW-3135 - [C++] Add helper functions for validity bitmap propagation in kernel context
  • ARROW-3149 - [C++] Use gRPC (when it exists) from conda-forge for CI builds
  • ARROW-3162 - [Python][Flight] Enable implementing Flight servers in Python
  • ARROW-3162 - Flight Python bindings
  • ARROW-3239 - [C++] Implement simple random array generation
  • ARROW-3255 - [C++/Python] Migrate Travis CI jobs off Xcode 6.4
  • ARROW-3289 - [C++] Implement Flight DoPut
  • ARROW-3292 - [C++] Test Flight RPC in Travis CI
  • ARROW-3295 - [Packaging] Package gRPC libraries in conda-forge for use in builds, packaging
  • ARROW-3297 - [Python] Python bindings for Flight C++ client
  • ARROW-3311 - [R] Functions for deserializing IPC components from arrow::Buffer or from IO interface
  • ARROW-3328 - [Flight] Allow for optional unique flight identifier to be sent with FlightGetInfo
  • ARROW-3361 - [R] Also run cpplint on Rcpp source files
  • ARROW-3364 - [Docs] Add docker-compose integration documentation
  • ARROW-3367 - [INTEGRATION] Port Spark integration test to the docker-compose setup
  • ARROW-3422 - [C++] Uniformly add ExternalProject builds to the "toolchain" target. Fix gRPC EP build on Linux
  • ARROW-3434 - [Packaging] Add Apache ORC C++ library to conda-forge
  • ARROW-3435 - [C++] Add option to use dynamic linking with re2
  • ARROW-3511 - [Gandiva] Link filter and project operations
  • ARROW-3532 - [Python] Emit warning when looking up duplicate struct or schema fields
  • ARROW-3550 - [C++] use kUnknownNullCount for the default null_count argument
  • ARROW-3554 - [C++] Reverse traits for C++
  • ARROW-3594 - [Packaging] Build "cares" library in conda-forge
  • ARROW-3595 - [Packaging] Build boringssl in conda-forge
  • ARROW-3596 - [Packaging] Build gRPC in conda-forge
  • ARROW-3619 - [R] Expose global thread pool options
  • ARROW-3631 - [C#] Add Appveyor configuration
  • ARROW-3653 - [C++][Python] Support data copying between different GPU devices
  • ARROW-3735 - [Python] Add test for calling cast() with None
  • ARROW-3761 - [R] Bindings for CompressedInputStream, CompressedOutputStream
  • ARROW-3763 - [C++] Write Parquet ByteArray / FixedLenByteArray reader batches directly into arrow::BinaryBuilder
  • ARROW-3769 - [C++] Add support for reading non-dictionary encoded binary Parquet columns directly as DictionaryArray
  • ARROW-3770 - [C++] Validate schema for each table written with parquet::arrow::FileWriter
  • ARROW-3816 - [R] nrow.RecordBatch method
  • ARROW-3824 - [R] Add basic build and test documentation
  • ARROW-3838 - [Rust] CSV Writer
  • ARROW-3846 - [Gandiva][C++] Build Gandiva C++ libraries and get unit tests passing on Windows
  • ARROW-3882 - [Rust] Cast Kernel for most types
  • ARROW-3903 - [Python] Random array generator for Arrow conversion and Parquet testing
  • ARROW-3926 - [Python] Add Gandiva bindings to Python manylinux1 wheels
  • ARROW-3951 - [Go] implement a CSV writer
  • ARROW-3954 - [Rust] Add Slice to Array and ArrayData
  • ARROW-3965 - [Java] JDBC-To-Arrow Configuration
  • ARROW-3966 - [Java] JDBC Column Metadata in Arrow Field Metadata
  • ARROW-3972 - [C++] Migrate to LLVM 7. Add option to disable using ld.gold
  • ARROW-3981 - [C++] Rename json.h
  • ARROW-3985 - [C++] Let ccache preserve comments
  • ARROW-4012 - [Website] Add documentation on how to install Apache Arrow on MSYS2
  • ARROW-4014 - [C++] Fix "LIBCMT" warnings on MSVC
  • ARROW-4023 - [Gandiva] Address long CI times in macOS builds
  • ARROW-4024 - [Python] Raise minimal Cython version to 0.29
  • ARROW-4031 - [C++] Refactor bitmap building
  • ARROW-4040 - [Rust] Add array_ops method for filtering an array
  • ARROW-4056 - [C++] Unpin boost-cpp in conda_env_cpp.yml
  • ARROW-4061 - [Rust][Parquet] Implement spaced version for non-diction…
  • ARROW-4068 - [Gandiva] Support building with Xcode 6.4
  • ARROW-4071 - [Rust] Add rustfmt as a pre-commit hook
  • ARROW-4072 - [Rust] Set default value for PARQUET_TEST_DATA
  • ARROW-4092 - [Rust] Implement common Reader / DataSource trait for CSV and Parquet
  • ARROW-4094 - [Python] Store RangeIndex in Parquet files as metadata rather than a physical data column
  • ARROW-4110 - [C++] Do not generate distinct cast kernels when input and output type are the same
  • ARROW-4123 - [C++] Enable linting tools to be run on Windows
  • ARROW-4124 - [C++] Draft Aggregate and Sum kernels
  • ARROW-4142 - [Java] JDBC Array -> Arrow ListVector
  • ARROW-4165 - [C++] Port cpp/apidoc/Windows.md and other files to Sphinx / rst
  • ARROW-4180 - [Java] Make CI tests use logback.xml
  • ARROW-4196 - [Rust] Add explicit SIMD vectorization for arithmetic ops in "array_ops"
  • ARROW-4198 - [Gandiva] Added support to cast timestamp
  • ARROW-4204 - [Gandiva] add support for decimal subtract
  • ARROW-4205 - [Gandiva] Support for decimal multiply
  • ARROW-4206 - [Gandiva] support decimal divide and mod
  • ARROW-4212 - [C++][Python] CudaBuffer view of arbitrary device memory object
  • ARROW-4230 - [C++] Fix Flight builds with gRPC/Protobuf/c-ares
  • ARROW-4232 - [C++] Follow conda-forge compiler ABI migration
  • ARROW-4234 - [C++] Improve memory bandwidth test
  • ARROW-4235 - [GLib] Use "column_builder" in GArrowRecordBatchBuilder
  • ARROW-4236 - [Java] Distinct plasma client create exceptions
  • ARROW-4245 - [Rust] Add Rustdoc header to source files
  • ARROW-4247 - [Packaging] Update verify script for 0.12.0
  • ARROW-4251 - [C++][Release] Add option to set ARROW_BOOST_VENDORED environment variable in verify-release-candidate.sh
  • ARROW-4262 - [Website] Preview to Spark with Arrow and R improvements
  • ARROW-4263 - [Rust] Donate DataFusion
  • ARROW-4265 - [C++] Automatic conversion between Table and std::vector<std::tuple<..>>
  • ARROW-4268 - [C++] Native C type TypeTraits
  • ARROW-4271 - [Rust] Move Parquet specific info to Parquet Readme
  • ARROW-4273 - [Release] Fix verification script to use cf201901 conda-forge label
  • ARROW-4277 - [C++] Add gmock to the toolchain
  • ARROW-4281 - [CI] Use Ubuntu Xenial VMs on Travis-CI
  • ARROW-4285 - [Python] Use proper builder interface for serialization
  • ARROW-4287 - [C++] Ensure minimal bison version on OSX for Thrift
  • ARROW-4289 - [C++] Forward AR and RANLIB to thirdparty builds
  • ARROW-4290 - [C++/Gandiva] Support detecting correct LLVM version in Homebrew
  • ARROW-4291 - [Dev] Support selecting features in release verification scripts
  • ARROW-4294 - [C++][Plasma] Add support for evicting Plasma objects to external store
  • ARROW-4297 - [C++] Fix build error with MinGW-w64 32-bit
  • ARROW-4298 - [Java] Add javax.annotation-api dependency for JDK >= 9
  • ARROW-4299 - [Ruby] Depend on the same version as Red Arrow
  • ARROW-4300 - [C++] Restore apache-arrow Homebrew recipe and define process for maintaining and updating for releases
  • ARROW-4303 - [Gandiva/Python] Build LLVM with RTTI in manylinux1 container
  • ARROW-4305 - [Rust] Fix parquet version number in README
  • ARROW-4307 - [C++] Fix Doxygen warnings
  • ARROW-4310 - [Website] Update install document for 0.12.0
  • ARROW-4313 - Define general benchmark database schema
  • ARROW-4315 - [Website] Add Go and Rust to list of supported languages
  • ARROW-4318 - [C++] Add Tensor::CountNonZero
  • ARROW-4321 - [CI] Setup conda-forge channel globally in docker containers
  • ARROW-4330 - [C++] More robust discovery of pthreads
  • ARROW-4331 - [C++] Extend Scalar Datum to support more types
  • ARROW-4332 - [Website] Improve documentation for publishing site
  • ARROW-4334 - [CI] Setup conda-forge channel globally in travis builds
  • ARROW-4335 - [C++] Better document sparse tensor support
  • ARROW-4336 - [C++] Change default build type to RELEASE
  • ARROW-4339 - [C++][Python] Developer documentation overhaul for 0.13 release
  • ARROW-4340 - [C++][CI] Build IWYU for LLVM 7 in iwyu docker-compose job
  • ARROW-4341 - [C++] Refactor Primitive builders and BooleanBuilder to use TypedBufferBuilder<T>
  • ARROW-4344 - [Java] Further cleanup mvn output, upgrade rat plugin
  • ARROW-4345 - [C++] Add Apache 2.0 license file to the Parquet-testing repository
  • ARROW-4346 - [C++] Fix class-memaccess warning on gcc 8.x
  • ARROW-4352 - [C++] Add support for system Google Test
  • ARROW-4353 - [CI] Add MinGW builds
  • ARROW-4358 - [CI] Restore support for trusty in CI
  • ARROW-4361 - [Website] Update committers list
  • ARROW-4362 - [Java] Test OpenJDK 11 in CI
  • ARROW-4363 - [CI][C++] Add CMake format checks
  • ARROW-4372 - [C++] Embed precompiled bitcode in the gandiva library
  • ARROW-4373 - [Packaging] Travis fails to deploy conda packages on OSX
  • ARROW-4375 - [CI] Sphinx dependencies were removed from docs conda environment
  • ARROW-4376 - [Rust] Implement from_buf_reader for csv::Reader
  • ARROW-4377 - [Rust] Implement std::fmt::Debug for PrimitiveArrays
  • ARROW-4379 - [Python] Register serializers for collections.Counter and collections.deque.
  • ARROW-4383 - [C++] Use the CMake's standard find features
  • ARROW-4386 - [Rust] Temporal array support
  • ARROW-4388 - [Go] add DimNames() method to tensor Interface
  • ARROW-4393 - [Rust] coding style: apply 90 characters per line limit
  • ARROW-4396 - [JS] Update Typedoc for TypeScript 3.2
  • ARROW-4397 - [C++] Add dim_names in Tensor and SparseTensor
  • ARROW-4399 - [C++] Do not use extern template class with NumericArray<T> and NumericTensor<T>
  • ARROW-4401 - [Python] Alpine dockerfile fails to build because pandas requires numpy as build dependency
  • ARROW-4406 - [Python] Exclude HDFS directories in S3 from ParquetManifest
  • ARROW-4408 - [CPP/Doc] Remove outdated Parquet documentation
  • ARROW-4422 - [Plasma] Enforce memory limit in plasma, rather than relying on dlmalloc_set_footprint_limit
  • ARROW-4423 - [C++] Upgrade vendored gmock/gtest to 1.8.1
  • ARROW-4424 - [Python] Install tensorflow and keras-preprocessing in manylinux1 container
  • ARROW-4425 - Add link to 'Contributing' page in the top-level Arrow README
  • ARROW-4430 - [C++] Fix untested TypedByteBuffer<T>::Append method
  • ARROW-4431 - [C++] Fixes for gRPC vendored builds
  • ARROW-4435 - Minor fixups to csharp .sln and .csproj file
  • ARROW-4436 - [Documentation] Update building.rst to reflect pyarrow req
  • ARROW-4442 - [JS] Add explicit type annotation to Chunked typeId getter
  • ARROW-4444 - [Testing] Add DataFusion test files to arrow-testing repo
  • ARROW-4445 - [C++][Gandiva] Run Gandiva-LLVM tests in Appveyor
  • ARROW-4446 - [C++][Python] Run Gandiva C++ unit tests in Appveyor, get build and tests working in Python
  • ARROW-4448 - [Java][Flight] Disable flaky TestBackPressure
  • ARROW-4449 - [Rust] Convert File to T: Read + Seek for schema inference
  • ARROW-4454 - [C++] fix unused parameter warnings
  • ARROW-4455 - [Plasma] Suppress class-memaccess warnings
  • ARROW-4459 - [Testing] Add arrow-testing repo as submodule
  • ARROW-4460 - [Website] DataFusion Blog Post
  • ARROW-4461 - [C++] Expose bit map operations that work with raw pointers
  • ARROW-4462 - [C++] Upgrade LZ4 v1.7.5 to v1.8.3 to compile with VS2017
  • ARROW-4464 - [Rust][DataFusion] Add support for LIMIT
  • ARROW-4466 - [Rust][DataFusion] Add support for Parquet data source
  • ARROW-4468 - [Rust] Implement BitAnd/BitOr for &Buffer (with SIMD) (#3571)
  • ARROW-4472 - [Website][Python] Blog post about string memory use work in Arrow 0.12
  • ARROW-4475 - [Python] Fix recursive serialization of self-containing objects
  • ARROW-4476 - [Rust][DataFusion] Update README to cover DataFusion and new testing git submodule
  • ARROW-4481 - [Website] Remove generated specification docs from site after docs migration
  • ARROW-4483 - [Website] Add myself to contributors.yaml to fix broken link in blog post
  • ARROW-4485 - [CI] Determine maintenance approach to pinned conda-forge binutils package
  • ARROW-4486 - [Python][CUDA] Add base argument to foreign_buffer
  • ARROW-4488 - [Rust][u8] > for Buffer does not ensure correct padding
  • ARROW-4489 - [Rust] PrimitiveArray.value_slice performs bounds checking when it should not
  • ARROW-4490 - [Rust] Add explicit SIMD vectorization for boolean ops in "array_ops"
  • ARROW-4491 - [Python] Use StringConverter and stringstream instead of std::stoi and std::to_string
  • ARROW-4499 - [CI] Unpin flake8 in lint script, fix warnings in dev/
  • ARROW-4502 - [C#] Add support for zero-copy reads
  • ARROW-4506 - [Ruby] Add Arrow::RecordBatch#raw_records
  • ARROW-4513 - [Rust] Implement BitAnd/BitOr for &Bitmap
  • ARROW-4517 - [JS] remove version number as it is not used
  • ARROW-4518 - [JS] add jsdelivr to package.json
  • ARROW-4528 - [C++] Update lint docker container to LLVM-7
  • ARROW-4529 - [C++] Add test for BitUtil::RoundDown
  • ARROW-4531 - [C++] Support slices for SumKernel
  • ARROW-4537 - [CI] Suppress shell warning on travis-ci
  • ARROW-4539 - [Java] Fix child vector count for lists. (#3625)
  • ARROW-4540 - [Rust] Basic JSON reader
  • ARROW-4543 - [C#] Update Flat Buffers code to latest version
  • ARROW-4546 - Update LICENSE.txt with parquet-cpp licenses
  • ARROW-4547 - [Python][Documentation] Update python/development.rst with instructions for CUDA-enabled builds
  • ARROW-4556 - [Rust] Preserve JSON field order when inferring schema
  • ARROW-4558 - [C++][Flight] Implement gRPC customizations without UB
  • ARROW-4560 - [R] array() needs to take single input, not ...
  • ARROW-4562 - [C++] Avoid copies when serializing Flight data
  • ARROW-4564 - [C++] IWYU docker image silently fails
  • ARROW-4565 - [R] Fix decimal record batches with no null values
  • ARROW-4568 - [C++] Add version macros to headers
  • ARROW-4572 - [C++] Remove memory zeroing from PrimitiveAllocatingUnaryKernel
  • ARROW-4583 - [Plasma] Fix some small bugs reported by code scan tool
  • ARROW-4586 - [Rust] Remove arrow/mod.rs as it is not needed
  • ARROW-4589 - [Rust] Projection push down query optimizer rule
  • ARROW-4590 - [Rust] Add explicit SIMD vectorization for comparison ops in "array_ops"
  • ARROW-4592 - [GLib] Stop configure immediately when GLib isn't available
  • ARROW-4593 - [Ruby][out_of_range] returns nil
  • ARROW-4594 - [Ruby] returns Arrow::Struct instead of Arrow::Array
  • ARROW-4595 - [Rust] Implement Table API (a.k.a DataFrame)
  • ARROW-4598 - [CI] Remove needless LLVM_DIR for macOS
  • ARROW-4599 - [C++] Add support for system GFlags
  • ARROW-4602 - [Rust][DataFusion] Integrate query optimizer with ExecutionContext
  • ARROW-4603 - [Rust] [DataFusion] Execution context should allow in-memory data sources to be registered
  • ARROW-4604 - [Rust] [DataFusion] Add benchmarks for SQL query execution
  • ARROW-4605 - [Rust] Move filter and limit code from DataFusion into compute module
  • ARROW-4609 - [C++] Use google benchmark from toolchain
  • ARROW-4610 - [Plasma] Avoid Crash in Plasma Java Client
  • ARROW-4611 - [C++] Rework CMake logic
  • ARROW-4612 - [Python] Use cython from PyPI for windows wheels build
  • ARROW-4613 - [C++] Set CMAKE_INSTALL_LIBDIR in gtest thirdparty build
  • ARROW-4614 - [C++/CI] Activate flight build in ci/docker_build_cpp.sh
  • ARROW-4615 - [C++] Add checked_pointer_cast
  • ARROW-4616 - [C++] Log message in BuildUtils as STATUS
  • ARROW-4618 - [Docker] Makefile to build dependent docker images
  • ARROW-4619 - [R] Fix the autobrew script
  • ARROW-4620 - [C#] Add unit tests for "Types" in arrow/csharp
  • ARROW-4623 - [R] update Rcpp version
  • ARROW-4628 - [Rust][DataFusion] Implement type coercion query optimizer rule
  • ARROW-4632 - [Ruby] Add BigDecimal#to_arrow
  • ARROW-4634 - [Rust][Parquet] Reorganize test_common
  • ARROW-4637 - [Python] Conditionally import pandas symbols if they are used. Do not require pandas as a test dependency
  • ARROW-4638 - [R] install instructions using brew
  • ARROW-4640 - [Python] Add docker-compose configuration to build and test the project without pandas installed
  • ARROW-4643 - [C++] Force compiler diagnostic colors
  • ARROW-4644 - [C++/Docker] Build Gandiva in the docker containers
  • ARROW-4645 - [C++/Packaging] Ship Gandiva with OSX and Windows wheels
  • ARROW-4646 - [C++/Packaging] Ship gandiva with the conda-forge packages
  • ARROW-4655 - [Packaging] Parallelize binary upload
  • ARROW-4662 - [Python] Add support of type_codes in UnionType
  • ARROW-4667 - [C++] Suppress unused function warnings with MinGW
  • ARROW-4670 - [Rust] array_ops::sum performance optimizations
  • ARROW-4671 - [C++] MakeBuilder doesn't support Type::DICTIONARY
  • ARROW-4673 - [C++] Implement Scalar::Equals and Datum::Equals
  • ARROW-4676 - [C++] Add support for debug build with MinGW
  • ARROW-4678 - [Rust] Minimize unstable feature usage
  • ARROW-4679 - [Rust] Implement in-memory data source for DataFusion
  • ARROW-4681 - [Rust][DataFusion] Partition aware data sources
  • ARROW-4686 - [Dev] Only accept 'y' or 'n' in merge_arrow_pr.py prompts
  • ARROW-4689 - [Go] Add support for wasm
  • ARROW-4690 - Building TensorFlow compatible wheels for Arrow
  • ARROW-4692 - [Flight] Explain sidecar in a bit more detail
  • ARROW-4693 - [CI] Build boost with multiprecision
  • ARROW-4697 - [C++] Add URI parsing facility
  • ARROW-4703 - [C++] Upgrade dependency versions
  • ARROW-4705 - [Rust] Improve error handling in csv reader
  • ARROW-4707 - [C++] moving BitsetStack to BitUtil::
  • ARROW-4718 - [C#] Add ArrowStreamReader/Writer ctor with bool leaveOpen
  • ARROW-4727 - [Rust] Add equality check for schemas
  • ARROW-4730 - [C++] Add docker-compose entry for testing Fedora build with system packages
  • ARROW-4731 - [C++] Add docker-compose entry for testing Ubuntu Xenial build with system packages
  • ARROW-4732 - [C++] Add docker-compose entry for testing Debian Testing build with system packages
  • ARROW-4733 - [C++] Add CI entry that builds without the conda-forge toolchain but with system packages
  • ARROW-4734 - [Go] Add option to write a header for CSV writer
  • ARROW-4735 - [Go] Optimize CSV writer CPU/Mem performance
  • ARROW-4739 - [Rust] LogicalPlan can now be passed to threads
  • ARROW-4740 - [Java] Upgrade to JUnit 5.
  • ARROW-4743 - [Java] Add javadoc missing in classes and methods in java…
  • ARROW-4745 - [C++][Documentation] Document notes from replicating Static_Crt_Build on windows
  • ARROW-4749 - [Rust] Return Result for RecordBatch::new()
  • ARROW-4751 - [C++] Add pkg-config to conda_env_cpp.yml now that it's available on Windows
  • ARROW-4754 - [Java] Randomize port and retry binding server when bind fails
  • ARROW-4756 - Update readme for triggering docker builds
  • ARROW-4758 - [C++][Flight] Fix intermittent build failure
  • ARROW-4769 - [Rust] Improve array limit fn where max_records >= len
  • ARROW-4772 - [C++] new ORC adapter interface for stripe and row iteration
  • ARROW-4776 - [C++] Add DictionaryBuilder constructor which takes a dictionary array
  • ARROW-4777 - [C++/Python] manylinux1: Update lz4 to 1.8.3
  • ARROW-4778 - [C++/Python] manylinux1: Update Thrift to 0.12.0
  • ARROW-4782 - [C++] Prototype array and scalar expression types to help with building a deferred compute graph
  • ARROW-4786 - [C++/Python] Support better parallelisation in manylinux1 base build
  • ARROW-4789 - [C++] Deprecate and later remove arrow::io::ReadableFileInterface
  • ARROW-4790 - [Python/Packaging] Update manylinux docker image in crossbow task
  • ARROW-4791 - [Rust] Remove unused dependencies
  • ARROW-4794 - [Python] Make pandas an optional test dependency
  • ARROW-4797 - [Plasma] Allow client to check store capacity and avoid server crash
  • ARROW-4801 - [GLib] Suppress Meson warnings
  • ARROW-4808 - [Java][Vector] More util methods to set decimal vector.
  • ARROW-4812 - [Rust] [DataFusion] Table.scan() should return one iterator per partition
  • ARROW-4817 - [Rust] [DataFusion] Small re-org of modules
  • ARROW-4818 - [Rust] [DataFusion] Parquet data source does not support null values
  • ARROW-4826 - [Go] export Flush method for CSV writer
  • ARROW-4831 - [C++] CMAKE_AR is not passed to ZSTD thirdparty dependency
  • ARROW-4833 - [Release] Document how to update the brew formula in the release management guide
  • ARROW-4834 - [R] Feature flag when building parquet
  • ARROW-4835 - [GLib] Add boolean operations
  • ARROW-4837 - [C++] Support c++filt on a custom path in the run-test.sh script
  • ARROW-4839 - [C#] Add NuGet package metadata and instructions.
  • ARROW-4843 - [Rust] [DataFusion] Parquet data source should support DATE
  • ARROW-4846 - [Java] Upgrade to jackson 2.9.8
  • ARROW-4849 - [C++] Add docker-compose entry for testing Ubuntu Bionic build with system packages
  • ARROW-4854 - [Rust] Use zero-copy slice for limit kernel
  • ARROW-4855 - [Packaging] Generate default package version based on cpp tags in crossbow.py
  • ARROW-4858 - [Flight/Python] enable FlightDataStream to be implemented in Python
  • ARROW-4859 - [GLib] Add garrow_numeric_array_mean()
  • ARROW-4862 - [C++] Fix gcc warnings in CHECKIN
  • ARROW-4862 - [GLib] Add GArrowCastOptions::allow-invalid-utf8 property
  • ARROW-4865 - [Rust] Support list casts
  • ARROW-4873 - [C++] Clarify documentation about how to use external ARROW_PACKAGE_PREFIX while also using CONDA dependency resolution
  • ARROW-4878 - [C++] Append \Library to CONDA_PREFIX when using ARROW_DEPENDENCY_SOURCE=CONDA
  • ARROW-4882 - [GLib] Add sum functions
  • ARROW-4887 - [GLib] Add garrow_array_count()
  • ARROW-4889 - [C++] Add STATUS messages for Protobuf in CMake
  • ARROW-4891 - [C++] Add zlib headers to include directories
  • ARROW-4892 - [Rust][DataFusion] Move SQL parser and planner into SQL module
  • ARROW-4893 - [C++] conda packages should use inside of conda-build
  • ARROW-4894 - [Rust][DataFusion] Remove all uses of panic! from aggregate.rs
  • ARROW-4895 - [Rust][DataFusion] Move error.rs to root of crate
  • ARROW-4896 - [Rust][DataFusion] Remove all uses of panic! from DataFusion tests
  • ARROW-4897 - [Rust][DataFusion] Improve rustdocs
  • ARROW-4898 - [C++] Old versions of FindProtobuf.cmake use ALL-CAPS for variables
  • ARROW-4899 - [Rust][DataFusion] Remove panic from expression.rs
  • ARROW-4901 - [Go] add AppVeyor CI
  • ARROW-4905 - [C++][Plasma] Remove dlmalloc symbols from client library
  • ARROW-4907 - [CI] Add docker container to inspect docker context
  • ARROW-4908 - [Rust][DataFusion] Add support for date/time parquet types encoded as INT32/INT64
  • ARROW-4909 - [CI] Use hadolint to lint Dockerfiles
  • ARROW-4910 - [Rust][DataFusion] Remove all uses of unimplemented!
  • ARROW-4915 - [GLib][C++] Add arrow::NullBuilder support for GLib
  • ARROW-4922 - [Packaging] Use system libraries for .deb and .rpm
  • ARROW-4924 - [Ruby] Add Decimal128#to_s(scale=nil)
  • ARROW-4925 - [Rust] [DataFusion] Remove duplicate implementations of collect_expr
  • ARROW-4926 - [Rust][DataFusion] Update README for 0.13.0
  • ARROW-4929 - [GLib] Add garrow_array_count_values()
  • ARROW-4932 - [GLib] Use G_DECLARE_DERIVABLE_TYPE macro
  • ARROW-4933 - [R] Autodetect Parquet support using pkg-config
  • ARROW-4937 - [R] Clean pkg-config related logic
  • ARROW-4939 - [Python] Add wrapper for "sum" kernel
  • ARROW-4940 - [Rust] Enable warnings for missing docs, add docs in datafusion
  • ARROW-4944 - [C++] Raise minimal required thrift-cpp to 0.11 in conda environment
  • ARROW-4946 - [C++] Support detection of flatbuffers without FlatbuffersConfig.cmake
  • ARROW-4947 - [Flight/C++] Remove redundant schema parameter to Flight client DoGet
  • ARROW-4951 - [C++] Turn off cpp benchmarks in cpp docker images
  • ARROW-4955 - [GLib] Add garrow_file_is_closed()
  • ARROW-4964 - [Ruby] Add closed check if available on auto close
  • ARROW-4969 - [C++] Set RPATH in correct order for test executables on OSX
  • ARROW-4977 - [Ruby] Add support for building on Windows
  • ARROW-4978 - [Ruby] Fix wrong internal variable name for table data
  • ARROW-4979 - [GLib] Add missing lock to garrow::GIOInputStream
  • ARROW-4980 - [GLib] Use GInputStream as the parent of GArrowInputStream
  • ARROW-4981 - [Ruby] Add support for CSV data encoding conversion
  • ARROW-4983 - [Plasma] Unmap memory upon destruction of the PlasmaClient
  • ARROW-4994 - [Website] Update details for ptgoetz
  • ARROW-4995 - [R] Support for winbuilder for CRAN checks
  • ARROW-4996 - [Plasma] Enable uninstalling of signal handler and fix log_dir
  • ARROW-5003 - [R] remove dependency on withr
  • ARROW-5006 - [R] parquet.cpp does not include enough Rcpp
  • ARROW-5011 - [Release] Add support in source release script for custom git hash
  • ARROW-5013 - [Rust][DataFusion] Refactor runtime expression support
  • ARROW-5014 - [Java] Fix typos in Flight module
  • ARROW-5018 - [Release] Include JavaScript implementation
  • ARROW-5032 - [C++] Install headers in vendored/datetime directory
  • ARROW-5041 - [C++] add GTest_SOURCE=BUNDLED to verify-release-candidate.bat
  • ARROW-5075 - [Release] Add 0.13.0 release note
  • ARROW-5084 - [Website] Blog post / release announcement for 0.13.0
  • PARQUET-1477 - [C++] sync thrift to final crypto spec
  • PARQUET-1508 - [C++] Read ByteArray data directly into arrow::BinaryBuilder and BinaryDictionaryBuilder. Refactor encoders/decoders to use cleaner virtual interfaces
  • PARQUET-1519 - [C++] Hide TypedColumnReader implementation behind virtual interfaces, remove use of "extern template class"
  • PARQUET-1521 - [C++] Use pure virtual interfaces for parquet::TypedColumnWriter, remove use of 'extern template class'
  • PARQUET-1525 - [C++] remove dependency on getopt in parquet tools
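
ARROW-1572 in the list above adds "value counts" kernels for tabulating value frequencies. As a rough illustration of what such a tabulation computes, here is a minimal TypeScript sketch; `valueCounts` is a hypothetical name, and skipping nulls is an assumption of this sketch rather than a statement about the Arrow kernel's behavior.

```typescript
// Hypothetical helper (not an Arrow API): count how often each distinct
// value occurs. Nulls are skipped here, which is an assumption of this
// sketch and not necessarily what the Arrow kernel does.
function valueCounts<T>(values: Iterable<T | null>): Map<T, number> {
  const counts = new Map<T, number>();
  for (const v of values) {
    if (v === null) continue;
    // Start from 0 the first time a value is seen, then increment.
    counts.set(v, (counts.get(v) || 0) + 1);
  }
  return counts;
}

const tags = valueCounts<string>(["C++", "Rust", "C++", null, "JS"]);
console.log(tags.get("C++"));  // 2
console.log(tags.get("Rust")); // 1
console.log(tags.size);        // 3
```

A `Map` keeps the distinct values in first-seen order, so iterating the result lists values in the order they first appeared in the input.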
kszucs
published 0.4.1 •

Apache Arrow 0.4.1 (2017-06-09)

Bug Fixes

  • ARROW-424 - [C++] Make ReadAt, Write HDFS functions threadsafe
  • ARROW-1039 - Python: pyarrow.Filesystem.read_parquet causing error if nthreads>1
  • ARROW-1050 - [C++] Export arrow::ValidateArray
  • ARROW-1051 - [Python] Opt in to Parquet unit tests to avoid accidental suppression of dynamic linking errors
  • ARROW-1056 - [Python] Ignore pandas index in parquet+hdfs test
  • ARROW-1057 - Fix cmake warning and msvc debug asserts
  • ARROW-1060 - [Python] Add unit tests for reference counts in memoryview interface
  • ARROW-1062 - [GLib] Follow API changes in examples
  • ARROW-1066 - [Python] pandas 0.20.1 deprecation of pd.lib causes a warning on import
  • ARROW-1070 - [C++] Use physical types for Feather date/time types
  • ARROW-1075 - [GLib] Fix build error on macOS
  • ARROW-1082 - [GLib] Add CI on macOS
  • ARROW-1085 - [Java] Follow up on template cleanup. Missing method for …
  • ARROW-1086 - include additional pxd files during package build
  • ARROW-1088 - [Python] Only test unicode filenames if system supports them
  • ARROW-1090 - Improve build_ext usability with --bundle-arrow-cpp
  • ARROW-1091 - Decimal scale and precision are flipped
  • ARROW-1092 - More Decimal and scale flipped follow-up
  • ARROW-1094 - [C++] Always truncate buffer read in ReadableFile::Read if actual number of bytes less than request
  • ARROW-1127 - pyarrow 4.1 import failure on Travis

New Features and Improvements

  • ARROW-897 - [GLib] Extract CI configuration for GLib
  • ARROW-986 - [Format] Add brief explanation of dictionary batches in IPC.md
  • ARROW-990 - [JS] Add tslint support for linting TypeScript
  • ARROW-1020 - [Format] Revise language for Timestamp type in Schema.fbs to avoid possible confusion about tz-naive timestamps
  • ARROW-1034 - [PYTHON] Resolve wheel build issues on Windows
  • ARROW-1049 - [java] vector template cleanup
  • ARROW-1063 - [Website] Updates for 0.4.0 release, release posting
  • ARROW-1068 - [Python] Create external repo with appveyor.yml configured for building Python wheel installers
  • ARROW-1069 - Add instructions for publishing maven artifacts
  • ARROW-1078 - [Python] Account for Apache Parquet shared library consolidation
  • ARROW-1080 - C++: Add tutorial about converting to/from row-wise representation
  • ARROW-1084 - Implementations of BufferAllocator should handle Netty's OutOfDirectMemoryError
  • ARROW-1118 - [Website] Site updates for 0.4.1

xhochy published 0.4.0

Changelog

Apache Arrow 0.4.0 (2017-05-22)

Bug Fixes

  • ARROW-813 - [Python] setup.py sdist must also bundle dependent cmake m…
  • ARROW-824 - Date and Time Vectors should reflect timezone-less semantics
  • ARROW-856 - Also read compiler info from stdout
  • ARROW-909 - Link jemalloc statically if built as external project
  • ARROW-939 - fix division by zero if one of the tensor dimensions is zero
  • ARROW-940 - [JS] Generate multiple artifacts
  • ARROW-944 - Python: Compat broken for pandas==0.18.1
  • ARROW-948 - [GLib] Update C++ header file list
  • ARROW-952 - fix regex include from C++ standard library
  • ARROW-958 - [Python] Fix conda source build instructions
  • ARROW-979 - [Python] Fix setuptools_scm version when release tag is not in the master timeline
  • ARROW-991 - [Python] Create new dtype when deserializing from Arrow to NumPy datetime64
  • ARROW-995 - [Website] Fix a typo
  • ARROW-998 - [Format] Clarify that the IPC file footer contains an additional copy of the schema
  • ARROW-1003 - [C++] Check flag _WIN32 instead of __WIN32
  • ARROW-1004 - [Python] Add conversions for numpy object arrays with integers and floats
  • ARROW-1017 - [Python] Fix memory leaks in conversion to pandas.DataFrame
  • ARROW-1023 - Python: Fix bundling of arrow-cpp for macOS
  • ARROW-1033 - [Python] pytest discovers scripts/test_leak.py
  • ARROW-1045 - [JAVA] Add support for custom metadata in org.apache.arrow.vector.types.pojo.*
  • ARROW-1046 - [Python] Reconcile pandas metadata spec
  • ARROW-1053 - [Python] Remove unnecessary Py_INCREF in PyBuffer causing memory leak
  • ARROW-1054 - [Python] Test suite fails on pandas 0.19.2
  • ARROW-1061 - [C++] Harden decimal parsing against invalid strings
  • ARROW-1064 - ModuleNotFoundError: No module named 'pyarrow._parquet'

New Features and Improvements

  • ARROW-29 - [C++] FindRe2 cmake module
  • ARROW-182 - [C++] Factor out Array::Validate into a separate function
  • ARROW-376 - Python: Convert non-range Pandas indices (optionally) to Arrow
  • ARROW-446 - [Python] Expand Sphinx documentation for 0.3
  • ARROW-482 - [Java] Exposing custom field metadata
  • ARROW-532 - [Python] Expand pyarrow.parquet documentation for 0.3 release
  • ARROW-579 - Python: Provide redistributable pyarrow wheels on OSX
  • ARROW-596 - [Python] Add convenience function to convert pandas.DataFrame to pyarrow.Buffer containing a file or stream representation
  • ARROW-629 - [JS] Add unit test suite
  • ARROW-714 - [C++] Add import_pyarrow C API in the style of NumPy for thirdparty C++ users
  • ARROW-819 - Public Cython and C++ API in the style of lxml, arrow::py::import_pyarrow method
  • ARROW-872 - [JS] Read streaming format
  • ARROW-873 - [JS] Implement fixed width list type
  • ARROW-874 - [JS] Read dictionary-encoded vectors
  • ARROW-881 - [Python] Reconstruct Pandas DataFrame indexes using metadata
  • ARROW-891 - [Python] Expand Windows build instructions to not require looking at separate C++ docs
  • ARROW-899 - [Doc] Add 0.3.0 changelog
  • ARROW-901 - [Python] Add Parquet unit test for fixed size binary
  • ARROW-913 - [Python] Only link jemalloc to the Cython extension where it's needed
  • ARROW-923 - Changelog generation Python script, add 0.1.0 and 0.2.0 changelog
  • ARROW-929 - Remove KEYS file from git
  • ARROW-943 - [GLib] Support running unit tests with source archive
  • ARROW-945 - [GLib] Add a Lua example to show Torch integration
  • ARROW-946 - [GLib] Use "new" instead of "open" for constructor name
  • ARROW-947 - [Python] Improve execution time of manylinux1 build
  • ARROW-953 - Use conda-forge cmake, curl in CI toolchain
  • ARROW-954 - Flag for compiling Arrow with header-only boost
  • ARROW-956 - [Python] compat with pandas >= 0.20.0
  • ARROW-957 - [Doc] Add HDFS and Windows documents to doxygen output
  • ARROW-961 - [Python] Rename InMemoryOutputStream to BufferOutputStream
  • ARROW-963 - [GLib] Add equal
  • ARROW-967 - [GLib] Support initializing array with buffer
  • ARROW-970 - [Python] Nicer experience if user accidentally calls pyarrow.Table ctor directly
  • ARROW-977 - [java] Add Timezone aware timestamp vectors
  • ARROW-980 - Fix detection of "msvc" COMPILER_FAMILY
  • ARROW-982 - [Website] Improve website front copy to highlight serialization efficiency benefits
  • ARROW-984 - [GLib] Add Go examples
  • ARROW-985 - [GLib] Update package information
  • ARROW-988 - [JS] Add entry to Travis CI matrix
  • ARROW-993 - [GLib] Add missing error checks in Go examples
  • ARROW-996 - [Website] Add 0.3.0 release announcement in Japanese
  • ARROW-997 - [Java] Implementing transferPair for FixedSizeListVector
  • ARROW-1000 - [GLib] Move install document to Website
  • ARROW-1001 - [GLib] Unify writer files
  • ARROW-1002 - [C++] Fix inconsistency with padding at start of IPC file format
  • ARROW-1008 - [C++] Add abstract stream writer and reader C++ APIs. Give clearer names to IPC reader/writer classes
  • ARROW-1010 - [Website] Provide for translations without repeating blog post in blogroll
  • ARROW-1011 - [FORMAT] fix typo and mistakes in Layout.md
  • ARROW-1014 - 0.4.0 release
  • ARROW-1015 - [Java] Schema-level metadata
  • ARROW-1016 - Python: Include C++ headers (optionally) in wheels
  • ARROW-1022 - [Python] Add multithreaded read option to read_feather
  • ARROW-1024 - Python: Update build time numpy version to 1.10.1
  • ARROW-1025 - [Website] Improved changelog for website, include git shortlog
  • ARROW-1027 - [Python] Allow negative indexing in fields/columns on pyarrow Table and Schema objects
  • ARROW-1028 - [Python] Fix IPC docs per API changes
  • ARROW-1029 - [Python] Fixes for building pyarrow with Parquet support on MSVC. Add to appveyor build
  • ARROW-1030 - Python: Account for library versioning in parquet-cpp
  • ARROW-1031 - [GLib] Support pretty print
  • ARROW-1037 - [GLib] Follow reader name change
  • ARROW-1038 - [GLib] Follow writer name change
  • ARROW-1040 - [GLib] Support tensor IO
  • ARROW-1044 - [GLib] Support Feather
  • ARROW-1126 - Python: Add function to convert NumPy/Pandas dtypes to Arrow DataTypes

wesm published 0.3.1

ptaylor published 0.3.0

Changelog

Apache Arrow 0.3.0 (2017-05-05)

Bug Fixes

  • ARROW-109 - [C++] Add nesting stress tests up to 500 recursion depth
  • ARROW-208 - Add checkstyle policy to java project
  • ARROW-347 - Add method to pass CallBack when creating a transfer pair
  • ARROW-413 - DATE type is not specified clearly
  • ARROW-431 - [Python] Review GIL release and acquisition in to_pandas conversion
  • ARROW-443 - [Python] Support ingest of strided NumPy arrays from pandas
  • ARROW-451 - [C++] Implement DataType::Equals as TypeVisitor. Add default implementations for TypeVisitor, ArrayVisitor methods
  • ARROW-454 - pojo.Field doesn't implement hashCode()
  • ARROW-526 - [Format] Revise Format documents for evolution in IPC stream / file / tensor formats
  • ARROW-565 - [C++] Examine "Field::dictionary" member
  • ARROW-570 - Determine Java tools JAR location from project metadata
  • ARROW-584 - [C++] Fix compiler warnings exposed with -Wconversion
  • ARROW-586 - Problem with reading parquet files saved by Apache Spark
  • ARROW-588 - [C++] Fix some 32 bit compiler warnings
  • ARROW-595 - [Python] Set schema attribute on StreamReader
  • ARROW-604 - Python: boxed Field instances are missing the reference to their DataType
  • ARROW-611 - [Java] TimeVector TypeLayout is incorrectly specified as 64 bit width
  • ARROW-613 - WIP TypeScript Implementation
  • ARROW-617 - [Format] Add additional Time metadata and comments based on discussion in ARROW-617
  • ARROW-619 - [Python] Fixed remaining typo for LD_LIBRARY_PATH
  • ARROW-619 - Fix typos in setup.py args and LD_LIBRARY_PATH
  • ARROW-623 - Fix segfault in repr of empty field
  • ARROW-624 - [C++] Restore MakePrimitiveArray function, use in feather.cc
  • ARROW-627 - [C++] Add compatibility macros for exported extern templates
  • ARROW-628 - [Python] Install nomkl metapackage when building parquet-cpp in Travis CI
  • ARROW-630 - [C++] Create boolean batches for IPC testing, properly account for nonzero offset
  • ARROW-636 - [C++] Update README about Boost system requirement
  • ARROW-639 - [C++] Invalid offset in slices
  • ARROW-642 - [Java] Remove temporary file in java/tools
  • ARROW-644 - Python: Cython should be a setup-only requirement
  • ARROW-652 - Remove trailing f in merge script output
  • ARROW-654 - [C++] Serialize timezone in IPC metadata
  • ARROW-666 - [Python] Error in DictionaryArray __repr__
  • ARROW-667 - build of arrow-master/cpp fails with altivec error?
  • ARROW-668 - [Python] Box timestamp values as pandas.Timestamp if available, attach tzinfo
  • ARROW-671 - [GLib] Install missing license file
  • ARROW-673 - [Java] Support additional Time metadata
  • ARROW-677 - [java] Fix checkstyle jcl-over-slf4j conflict issue
  • ARROW-678 - [GLib] Fix dependencies
  • ARROW-680 - [C++] Support CMake 2 or older again
  • ARROW-682 - [Integration] Check implementations against themselves
  • ARROW-683 - [C++/Python] Refactor to make Date32 and Date64 types for new metadata. Test IPC roundtrip
  • ARROW-685 - [GLib] AX_CXX_COMPILE_STDCXX_11 error running ./configure
  • ARROW-686 - [C++] Account for time metadata changes, add Time32 and Time64 types
  • ARROW-689 - [GLib] Fix install directories
  • ARROW-691 - [Java] Encode dictionary type in message format
  • ARROW-697 - JAVA Throw exception for record batches > 2GB
  • ARROW-699 - [C++] Resolve Arrow and Arrow IPC build issues on Windows
  • ARROW-702 - fix BitVector.copyFromSafe to reAllocate instead of returning false
  • ARROW-703 - Fix issue where setValueCount(0) doesn’t work in the case that we’ve shipped vectors across the wire
  • ARROW-704 - Fix bad import caused by conflicting changes
  • ARROW-709 - [C++] Restore type comparator for DecimalType
  • ARROW-713 - [C++] Fix cmake linking issue in new IPC benchmark
  • ARROW-715 - [Python] Make pandas not a hard requirement, flake8 fixes
  • ARROW-716 - [Python] Update README build instructions after moving libpyarrow to C++ tree
  • ARROW-720 - arrow should not have a dependency on slf4j bridges in com…
  • ARROW-723 - [Python] Ensure that passing chunk_size=0 when writing Parquet file does not enter infinite loop
  • ARROW-726 - [C++] Fix segfault caused when passing non-buffer object to arrow::py::PyBuffer
  • ARROW-732 - [C++] Schema comparison bugs in struct and union types
  • ARROW-736 - [Python] Mixed-type object DataFrame columns should not silently co…
  • ARROW-738 - Fix manylinux1 build
  • ARROW-739 - Don't install jemalloc in parallel
  • ARROW-740 - FileReader fails for large objects
  • ARROW-747 - [C++] Calling add_dependencies with dl causes spurious CMake warning
  • ARROW-749 - [Python] Delete partially-written Feather file when column write fails
  • ARROW-753 - [Python] Fix linker error for python-test on OS X
  • ARROW-756 - [C++] MSVC build fixes and cleanup, remove -fPIC flag from EP builds on Windows, Dev docs
  • ARROW-757 - [C++] MSVC build fails on googletest when using NMake
  • ARROW-762 - [Python] Start docs page about files and filesystems, adapt C++ docs about HDFS
  • ARROW-776 - [GLib] Fix wrong type name
  • ARROW-777 - restore getObject behavior on Date and Time
  • ARROW-778 - Port merge tool to work on Windows
  • ARROW-780 - PYTHON_EXECUTABLE Required to be set during build
  • ARROW-781 - [C++/Python] Increase reference count of the numpy base array?
  • ARROW-783 - [Java/C++] Fixes for 0-length record batches
  • ARROW-787 - [GLib] Fix compilation error caused by introducing BooleanBuilder::Append overload
  • ARROW-789 - Fix issue where setValueCount(0) doesn’t work in the case that we’ve shipped vectors across the wire
  • ARROW-793 - [GLib] Fix indent
  • ARROW-794 - [C++/Python] Disallow strided tensors in ipc::WriteTensor
  • ARROW-796 - [Java] Checkstyle additions causing build failure in some environments
  • ARROW-797 - [Python] Make more explicitly curated public API page, sphinx cleanup
  • ARROW-800 - [C++] Boost headers being transitively included in pyarrow
  • ARROW-805 - [C++] Don't throw IOError when listing empty HDFS dir
  • ARROW-809 - [C++] Do not write excess bytes in IPC writer after slicing arrays
  • ARROW-812 - Pip install pyarrow on mac failed.
  • ARROW-817 - [Python] Fix comment in date32 conversion
  • ARROW-821 - [Python] Extra file _table_api.h generated during Python build process
  • ARROW-822 - [Python] StreamWriter Wrapper for Socket and File-like Objects without tell()
  • ARROW-826 - [C++/Python] Fix compilation error on Mac with -DARROW_PYTHON=on
  • ARROW-829 - Don't deactivate Parquet dictionary encoding on column-wis…
  • ARROW-830 - [Python] Expose jemalloc memory pool and other memory pool functions in public pyarrow API
  • ARROW-836 - add test for pandas conversion of timedelta, currently unimplemented
  • ARROW-839 - [Python] Use mktime variant that is reliable on MSVC
  • ARROW-847 - Specify BUILD_BYPRODUCTS for gtest
  • ARROW-852 - Also search for ARROW libs when pkg-config provided the path
  • ARROW-853 - [Python] Only set RPATH when bundling the shared libraries
  • ARROW-858 - Remove boost_regex from arrow dependencies
  • ARROW-866 - [Python] Be robust to PyErr_Fetch returning a null exc value
  • ARROW-867 - [Python] pyarrow MSVC fixes
  • ARROW-875 - Avoid setting an extra empty in fillEmpties()
  • ARROW-879 - compat with pandas v0.20.0
  • ARROW-882 - [C++] Rename statically build library on Windows to avoid …
  • ARROW-883 - [JAVA] Introduction of new types has shifted Enumerations
  • ARROW-885 - [Python/C++] Decimal test failure on MSVC
  • ARROW-886 - [Java] Fixing reallocation of VariableLengthVector offsets
  • ARROW-887 - add default value to units for backward compatibility
  • ARROW-888 - Transfer ownership of buffer in BitVector transferTo()
  • ARROW-895 - Fix lastSet in fillEmpties() and copyFrom()
  • ARROW-900 - [Python] Fix UnboundLocalError in ParquetDatasetPiece.read
  • ARROW-903 - [GLib] Remove a needless "."
  • ARROW-914 - [C++/Python] Fix Decimal ToBytes
  • ARROW-922 - Allow Flatbuffers and RapidJSON to be used locally on Windows
  • ARROW-927 - C++/Python: Add manylinux1 builds to Travis matrix
  • ARROW-928 - [C++] Detect supported MSVC versions
  • ARROW-933 - [Python] Remove debug print statement
  • ARROW-934 - [GLib] GLib sources missing from result of 02-source.sh
  • ARROW-936 - add missing file; revert tag change
  • ARROW-936 - fix release README
  • ARROW-938 - Fix Rat license warnings

New Features and Improvements

  • ARROW-6 - Hope to add development document
  • ARROW-39 - C++: Logical chunked arrays / columns: conforming to fixed chunk sizes
  • ARROW-52 - Set up project blog
  • ARROW-95 - Add Jekyll-based website publishing toolchain, migrate existing arrow-site
  • ARROW-98 - Java: API documentation
  • ARROW-99 - C++: Explore if RapidCheck may be helpful for testing / worth adding to toolchain
  • ARROW-183 - C++: Add storage type to DecimalType
  • ARROW-231 - [C++] Add typed Resize to PoolBuffer
  • ARROW-281 - [C++] IPC/RPC support on Win32 platforms
  • ARROW-316 - [Format] Changes to Date metadata format per discussion in ARROW-316
  • ARROW-341 - [Python] Move pyarrow's C++ code to the main C++ source tree, install libarrow_python and headers
  • ARROW-452 - [C++/Python] Incorporate C++ and Python codebases for Feather file format
  • ARROW-459 - [C++] Dictionary IPC support in file and stream formats
  • ARROW-483 - [C++/Python] Provide access to "custom_metadata" Field attribute in IPC setting
  • ARROW-491 - [Format / C++] Add FixedWidthBinary type to format, C++ implementation
  • ARROW-492 - [C++] Add arrow/arrow.h public API
  • ARROW-493 - [C++] Permit large (length > INT32_MAX) arrays in memory
  • ARROW-502 - [C++/Python] Logging memory pool
  • ARROW-510 - ARROW-582 ARROW-663 ARROW-729: [Java] Added units for Time and Date types, and integration tests
  • ARROW-518 - C++: Make Status::OK method constexpr
  • ARROW-520 - [C++] STL-compliant allocator
  • ARROW-528 - [Python] Utilize improved Parquet writer C++ API, add write_metadata function, test _metadata files
  • ARROW-534 - [C++] Add IPC tests for date/time after ARROW-452, fix bugs
  • ARROW-539 - [Python] Add support for reading partitioned Parquet files with Hive-like directory schemes
  • ARROW-542 - Adding dictionary encoding to FileWriter
  • ARROW-550 - [Format] Draft experimental Tensor flatbuffer message type
  • ARROW-552 - [Python] Implement getitem for DictionaryArray by returning a value from the dictionary
  • ARROW-557 - [Python] Add option to explicitly opt in to HDFS tests, do not implicitly skip
  • ARROW-563 - Support non-standard gcc version strings
  • ARROW-566 - Bundle Arrow libraries in Python package
  • ARROW-568 - [C++] Add default implementations for TypeVisitor, ArrayVisitor methods that return NotImplemented
  • ARROW-569 - [C++] Set version for *.pc
  • ARROW-574 - Python: Add support for nested Python lists in Pandas conversion
  • ARROW-576 - [C++] Complete file/stream implementation for union types
  • ARROW-577 - [C++] Use private implementation pattern in ipc::StreamWriter and ipc::FileWriter
  • ARROW-578 - [C++] Add -DARROW_CXXFLAGS=... option to make CMake more consistent
  • ARROW-580 - C++: Also provide jemalloc_X targets if only a static or shared version is found
  • ARROW-582 - [Java] Added JSON reader/writer unit test for date, time, and timestamp
  • ARROW-589 - C++: Use system provided shared jemalloc if static is unavailable
  • ARROW-591 - [C++] Add round trip testing fixture for JSON format
  • ARROW-593 - [C++] Rename ReadableFileInterface to RandomAccessFile
  • ARROW-598 - [Python] Add support for converting pyarrow.Buffer to a memoryview with zero copy
  • ARROW-603 - [C++] Add RecordBatch::Validate method, call in RecordBatch ctor in debug builds
  • ARROW-605 - [C++] Refactor IPC adapter code into generic ArrayLoader class. Add Date32Type
  • ARROW-606 - [C++] upgrade flatbuffers version to 1.6.0
  • ARROW-608 - [Format] Days since epoch date type
  • ARROW-610 - [C++] Win32 compatibility in file.cc
  • ARROW-612 - [Java] Added not null to Field.toString output
  • ARROW-615 - [Java] Moved ByteArrayReadableSeekableByteChannel to src main o.a.a.vector.util
  • ARROW-616 - [C++] Do not include debug symbols in release builds by default
  • ARROW-618 - [Python/C++] Support timestamp+timezone conversion to pandas
  • ARROW-620 - [C++] Implement JSON integration test support for date, time, timestamp, fixed width binary
  • ARROW-621 - [C++] Start IPC benchmark suite for record batches, implement "inline" visitor. Code reorg
  • ARROW-625 - [C++] Add TimeUnit to TimeType::ToString. Add timezone to TimestampType::ToString if present
  • ARROW-626 - [Python] Replace PyBytesBuffer with zero-copy, memoryview-based PyBuffer
  • ARROW-631 - [GLib] Import
  • ARROW-632 - [Python] Add support for FixedWidthBinary type
  • ARROW-635 - [C++] Add JSON read/write support for FixedWidthBinary
  • ARROW-637 - [Format] Add timezone to Timestamp metadata, comments describing the semantics
  • ARROW-646 - [Python] Conda s3 robustness, set CONDA_PKGS_DIR env variable and add Travis CI caching
  • ARROW-647 - [C++] Use Boost shared libraries for tests and utilities
  • ARROW-648 - [C++] Support multiarch on Debian
  • ARROW-650 - [GLib] Follow ReadableFileInterface -> RandomAccessFile change
  • ARROW-651 - [C++] Set version to shared library
  • ARROW-655 - [C++/Python] Implement DecimalArray
  • ARROW-656 - [C++] Add random access writer for a mutable buffer. Rename WriteableFileInterface to WriteableFile for better consistency
  • ARROW-657 - [C++/Python] Expose Tensor IPC in Python. Add equals method. Add pyarrow.create_memory_map/memory_map functions
  • ARROW-658 - [C++] Implement a prototype in-memory arrow::Tensor type
  • ARROW-659 - [C++] Add multithreaded memcpy implementation
  • ARROW-660 - [C++] Restore function that can read a complete encapsulated record batch message
  • ARROW-661 - [C++] Add LargeRecordBatch metadata type, IPC support, associated refactoring
  • ARROW-662 - [Format] Move Schema flatbuffers into their own file that can be included
  • ARROW-663 - [Java] Support additional Time metadata + vector value accessors
  • ARROW-664 - [C++] Make C++ Arrow serialization deterministic
  • ARROW-669 - [Python] Attach proper tzinfo when computing boxed scalars for TimestampArray
  • ARROW-670 - Arrow 0.3 release
  • ARROW-672 - [Format] Add MetadataVersion::V3 for Arrow 0.3
  • ARROW-674 - [Java] Support additional Timestamp timezone metadata
  • ARROW-675 - [GLib] Update package metadata
  • ARROW-676 - move from MinorType to FieldType in ValueVectors to carry all the relevant type bits
  • ARROW-679 - [Format] Change FieldNode, RecordBatch lengths to long, remove LargeRecordBatch. Refactoring
  • ARROW-681 - [C++] Disable boost's autolinking if shared boost is used …
  • ARROW-684 - [Python] More helpful error message if libparquet_arrow not built
  • ARROW-687 - [C++] Build and run full test suite in Appveyor
  • ARROW-688 - [C++] Use CMAKE_INSTALL_INCLUDEDIR for consistency
  • ARROW-690 - Only send JIRA updates to issues@arrow.apache.org
  • ARROW-698 - Add flag to FileWriter::WriteRecordBatch for writing record batches with lengths over INT32_MAX
  • ARROW-700 - Add headroom interface for allocator
  • ARROW-701 - [Java] Support Additional Date Type Metadata
  • ARROW-706 - [GLib] Add package install document
  • ARROW-707 - [Python] Return NullArray for array of all None in Array.from_pandas. Revert from_numpy -> from_pandas
  • ARROW-708 - [C++] Simplify metadata APIs to all use the Message class, perf analysis
  • ARROW-710 - [Python] Read/write with file-like Python objects from read_feather/write_feather
  • ARROW-711 - [C++] Remove extern template declarations for NumericArray<T> types
  • ARROW-712 - [C++] Reimplement Array::Accept as inline visitor
  • ARROW-717 - [C++] Implement IPC zero-copy round trip for tensors
  • ARROW-718 - [Python] Implement pyarrow.Tensor container, zero-copy NumPy roundtrips
  • ARROW-719 - [GLib] Release source archive
  • ARROW-722 - [Python] Support additional date/time types and metadata, conversion to/from NumPy and pandas.DataFrame
  • ARROW-724 - Add How to Contribute section to README
  • ARROW-725 - [Formats/Java] FixedSizeList message and java implementation
  • ARROW-727 - [Python] Ensure that NativeFile.write accepts any bytes, unicode, or object providing buffer protocol. Rename build_arrow_buffer to pyarrow.frombuffer
  • ARROW-728 - [C++/Python] Add Table::RemoveColumn method, remove name member, some other code cleaning
  • ARROW-729 - [Java] Add vector type for 32-bit date as days since UNIX epoch
  • ARROW-731 - [C++] Add shared library related versions to .pc
  • ARROW-733 - [C++/Python] Rename FixedWidthBinary to FixedSizeBinary for consistency with FixedSizeList
  • ARROW-734 - [C++/Python] Support building PyArrow on MSVC
  • ARROW-735 - [C++] Developer instruction document for MSVC on Windows
  • ARROW-737 - [C++] Enable mutable buffer slices, SliceMutableBuffer function
  • ARROW-741 - [Python] Switch Travis CI to use Python 3.6 instead of 3.5
  • ARROW-743 - [C++] Consolidate all but decimal array tests into array-test, collect some tests in type-test.cc
  • ARROW-744 - [GLib] Re-add an assertion for garrow_table_new() test
  • ARROW-745 - [C++] Allow use of system cpplint
  • ARROW-746 - [GLib] Add garrow_array_get_data_type()
  • ARROW-748 - [Python] Pin runtime library versions in conda-forge packages to force upgrades
  • ARROW-751 - [Python] Make all Cython modules private. Some code tidying
  • ARROW-752 - [Python] Support boxed Arrow arrays as input to DictionaryArray.from_arrays
  • ARROW-754 - [GLib] Add garrow_array_is_null()
  • ARROW-755 - [GLib] Add garrow_array_get_value_type()
  • ARROW-758 - [C++] Build with /WX in Appveyor, fix MSVC compiler warnings
  • ARROW-761 - [C++/Python] Add GetTensorSize method, Python bindings
  • ARROW-763 - C++: Use to find libpythonX.X.dylib
  • ARROW-765 - [Python] Add more natural Exception type hierarchy for thirdparty users
  • ARROW-768 - [Java] Change the "boxed" object representation of date and time types
  • ARROW-769 - [GLib] Support building without installed Arrow C++
  • ARROW-770 - [C++] Move .clang* files back into cpp source tree
  • ARROW-771 - [Python] Add read_row_group / num_row_groups to ParquetFile
  • ARROW-773 - [CPP] Add Table::AddColumn API
  • ARROW-774 - [GLib] Remove needless LICENSE.txt copy
  • ARROW-775 - add simple constructors to value vectors
  • ARROW-779 - [C++] Check for old metadata and raise exception if found
  • ARROW-782 - [C++] API cleanup, change public member access in DataType classes to functions, use class instead of struct
  • ARROW-788 - [C++] Align WriteTensor message
  • ARROW-795 - [C++] Consolidate arrow/arrow_io/arrow_ipc into a single shared and static library
  • ARROW-798 - [Docs] Publish Format Markdown documents somehow on arrow.apache.org
  • ARROW-802 - [GLib] Add read examples
  • ARROW-803 - [GLib] Update package repository URL
  • ARROW-804 - [GLib] Update build document
  • ARROW-806 - [GLib] Support add/remove a column from table
  • ARROW-807 - [GLib] Update "Since" tag
  • ARROW-808 - [GLib] Remove needless ignore entries
  • ARROW-810 - [GLib] Remove io/ipc prefix
  • ARROW-811 - [GLib] Add GArrowBuffer
  • ARROW-815 - [Java] Exposing reAlloc for ValueVector
  • ARROW-816 - [C++] Travis CI script cleanup, add C++ toolchain env with Flatbuffers, RapidJSON
  • ARROW-818 - [Python] Expand Sphinx API docs, pyarrow.* namespace. Add factory functions for time32, time64
  • ARROW-820 - [C++] Build dependencies for Parquet library without arrow…
  • ARROW-825 - [Python] Rename pyarrow.from_pylist to pyarrow.array, test on tuples
  • ARROW-827 - [Python] Miscellaneous improvements to help with Dask support
  • ARROW-828 - [C++] Add new dependency to README
  • ARROW-831 - Switch from boost::regex to std::regex
  • ARROW-832 - [C++] Update to gtest 1.8.0, remove now unneeded test_main.cc
  • ARROW-833 - [Python] Add Developer quickstart for conda users
  • ARROW-841 - [Python] Add pyarrow build to Appveyor
  • ARROW-844 - [Format] Update README documents in format/
  • ARROW-845 - [Python] Sync changes from PARQUET-955; explicit ARROW_HOME will override pkgconfig
  • ARROW-846 - [GLib] Add GArrowTensor, GArrowInt8Tensor and GArrowUInt8Tensor
  • ARROW-848 - [Python] Another pass on conda dev guide
  • ARROW-849 - [C++] Support setting production build dependencies with ARROW_BUILD_TOOLCHAIN
  • ARROW-857 - [Python] Automate publishing Python documentation to arrow-site
  • ARROW-859 - [C++] Do not build unit tests by default?
  • ARROW-860 - [C++] Remove typed Tensor containers
  • ARROW-861 - [Python] Move DEVELOPMENT.md to Sphinx docs
  • ARROW-862 - [Python] Simplify README landing documentation to direct users and developers toward the documentation
  • ARROW-863 - [GLib] Use GBytes to implement zero-copy
  • ARROW-864 - [GLib] Unify Array files
  • ARROW-865 - [Python] Add unit tests validating Parquet date/time type roundtrips
  • ARROW-868 - [GLib] Use GBytes to reduce copy
  • ARROW-869 - [JS] Rename directory to js/
  • ARROW-871 - [GLib] Unify DataType files
  • ARROW-876 - [GLib] Unify ArrayBuilder files
  • ARROW-877 - [GLib] Add garrow_array_get_null_bitmap()
  • ARROW-878 - [GLib] Add garrow_binary_array_get_buffer()
  • ARROW-880 - [GLib] Support getting raw data of primitive arrays
  • ARROW-890 - [GLib] Add GArrowMutableBuffer
  • ARROW-892 - [GLib] Fix GArrowTensor document
  • ARROW-893 - Add GLib document to Web site
  • ARROW-894 - [GLib] Add GArrowResizableBuffer and GArrowPoolBuffer
  • ARROW-896 - Support Jupyter Notebook in Web site
  • ARROW-898 - [C++/Python] Use shared_ptr to avoid copying KeyValueMetadata, add to Field type also
  • ARROW-904 - [GLib] Simplify error check codes
  • ARROW-907 - C++: Construct Table from schema and arrays
  • ARROW-908 - [GLib] Unify OutputStream files
  • ARROW-910 - [C++] Write 0 length at EOS in StreamWriter
  • ARROW-916 - [GLib] Add GArrowBufferOutputStream
  • ARROW-917 - [GLib] Add GArrowBufferReader
  • ARROW-918 - [GLib] Use GArrowBuffer for read buffer
  • ARROW-919 - [GLib] Use "id" to get type enum value from GArrowDataType
  • ARROW-920 - [GLib] Add Lua examples
  • ARROW-925 - [GLib] Fix GArrowBufferReader test
  • ARROW-926 - Add wesm to KEYS
  • ARROW-930 - javadoc generation fails with java 8
  • ARROW-931 - [GLib] Reconstruct input stream
  • ARROW-965 - Website updates for 0.3.0 release

ptaylor published 0.2.0

Changelog

Apache Arrow 0.2.0 (2017-02-18)

Bug Fixes

  • ARROW-112 - Changed constexprs to kValue naming.
  • ARROW-202 - Integrate with appveyor ci for windows
  • ARROW-220 - [C++] Build conda artifacts in a build environment with better cross-linux ABI compatibility
  • ARROW-224 - [C++] Address static linking of boost dependencies
  • ARROW-230 - Python: Do not name modules like native ones (i.e. rename pyarrow.io)
  • ARROW-239 - Test reading remainder of file in HDFS with read() with no args
  • ARROW-261 - Refactor String/Binary code paths to reflect unnested (non-list-based) structure
  • ARROW-273 - Lists use unsigned offset vectors instead of signed (as defined in the spec)
  • ARROW-275 - Add tests for UnionVector in Arrow File
  • ARROW-294 - [C++] Do not use platform-dependent fopen/fclose functions for MemoryMappedFile
  • ARROW-322 - [C++] Remove ARROW_HDFS option, always build the module
  • ARROW-323 - [Python] Opt-in to pyarrow.parquet extension rather than attempting and failing silently
  • ARROW-334 - [Python] Remove INSTALL_RPATH_USE_LINK_PATH
  • ARROW-337 - UnionListWriter.list() is doing more than it should, this …
  • ARROW-339 - [Dev] Lingering Python 3 fixes
  • ARROW-339 - Python 3 compatibility in merge_arrow_pr.py
  • ARROW-340 - [C++] Opening a writeable file on disk that already exists does not truncate to zero
  • ARROW-342 - Set Python version on release
  • ARROW-345 - libhdfs integration doesn't work for Mac
  • ARROW-346 - Use conda environment to build API docs
  • ARROW-348 - [Python] Add build-type command line option to setup.py, build CMake extensions in a build type subdirectory
  • ARROW-349 - Add six as a requirement
  • ARROW-351 - Time type has no unit
  • ARROW-354 - Fix comparison of arrays of empty strings
  • ARROW-357 - Use a single RowGroup for Parquet files as default.
  • ARROW-358 - Add explicit environment variable to locate libhdfs in one's environment
  • ARROW-362 - Fix memory leak in zero-copy arrow to NumPy/pandas conversion
  • ARROW-371 - Handle pandas-nullable types correctly
  • ARROW-375 - Fix unicode Python 3 issue in columns argument of parquet.read_table
  • ARROW-384 - Align Java and C++ RecordBatch data and metadata layout
  • ARROW-386 - [Java] Respect case of struct / map field names
  • ARROW-387 - [C++] Verify zero-copy Buffer slices from BufferReader retain reference to parent Buffer
  • ARROW-390 - Only specify dependencies for json-integration-test on ARROW_BUILD_TESTS=ON
  • ARROW-392 - [C++/Java] String IPC integration testing / fixes. Add array / record batch pretty-printing
  • ARROW-393 - [JAVA] JSON file reader fails to set the buffer size on String data vector
  • ARROW-395 - Arrow file format writes record batches in reverse order.
  • ARROW-398 - Java file format requires bitmaps of all 1's to be written…
  • ARROW-399 - ListVector.loadFieldBuffers ignores the ArrowFieldNode len…
  • ARROW-400 - set struct length on load
  • ARROW-401 - Floating point vectors should do an approximate comparison…
  • ARROW-402 - Fix reference counting issue with empty buffers. Close #232
  • ARROW-403 - [Java] Create transfer pairs for internal vectors in UnionVector transfer impl
  • ARROW-404 - [Python] Fix segfault caused by HdfsClient getting closed before an HdfsFile
  • ARROW-405 - Use vendored hdfs.h if not found in include/ in $HADOOP_HOME
  • ARROW-406 - [C++] Set explicit 64K HDFS buffer size, test large reads
  • ARROW-408 - Remove defunct conda recipes
  • ARROW-414 - [Java] "Buffer too large to resize to ..." error
  • ARROW-420 - Align DATE type with Java implementation
  • ARROW-421 - [Python] Retain parent reference in PyBytesReader
  • ARROW-422 - IPC should depend on rapidjson_ep if RapidJSON is vendored
  • ARROW-429 - Revert ARROW-379 until git-archive issues are resolved
  • ARROW-433 - Correctly handle Arrow to Python date conversion for timezones west of London
  • ARROW-434 - [Python] Correctly handle Python file objects in Parquet read/write paths
  • ARROW-435 - Fix spelling of RAPIDJSON_VENDORED
  • ARROW-437 - [C++] Fix clang compiler warning
  • ARROW-445 - arrow_ipc_objlib depends on Flatbuffer generated files
  • ARROW-447 - Always return unicode objects for UTF-8 strings
  • ARROW-455 - [C++] Add dtor to BufferOutputStream that calls Close()
  • ARROW-469 - C++: Add option so that resize doesn't decrease the capacity
  • ARROW-481 - [Python] Fix 2.7 regression in Parquet path to open file code path
  • ARROW-486 - [C++] Use virtual inheritance for diamond inheritance
  • ARROW-487 - Python: ConvertTableToPandas segfaults if ObjectBlock::Write fails
  • ARROW-494 - [C++] Extend lifetime of memory mapped data if any buffers reference it
  • ARROW-499 - Update file serialization to use the streaming serialization format.
  • ARROW-505 - [C++] Fix compiler warning in gcc in release mode
  • ARROW-511 - Python: Implement List conversions for single arrays
  • ARROW-513 - [C++] Fixing Appveyor / MSVC build
  • ARROW-516 - Building pyarrow with parquet
  • ARROW-519 - [C++] Refactor array comparison code into a compare.h / compare.cc in part to resolve Xcode 6.1 linker issue
  • ARROW-523 - Python: Account for changes in PARQUET-834
  • ARROW-533 - [C++] arrow::TimestampArray / TimeArray has a broken constructor
  • ARROW-535 - [Python] Add type mapping for NPY_LONGLONG
  • ARROW-537 - [C++] Do not compare String/Binary data in null slots when comparing arrays
  • ARROW-540 - [C++] Build fixes after ARROW-33, PARQUET-866
  • ARROW-543 - C++: Lazily computed null_counts counts number of non-null entries
  • ARROW-544 - [C++] Test writing zero-length record batches, zero-length BinaryArray fixes
  • ARROW-545 - [Python] Ignore non .parq/.parquet files when reading directories as Parquet datasets
  • ARROW-548 - [Python] Add nthreads to Filesystem.read_parquet and pass through
  • ARROW-551 - C++: Construction of Column with nullptr Array segfaults
  • ARROW-556 - [Integration] Configure C++ integration test executable with a single environment variable. Update README
  • ARROW-561 - [JAVA][PYTHON] Update java & python dependencies to improve downstream packaging experience
  • ARROW-562 - Mockito should be in test scope

New Features and Improvements

  • ARROW-33 - [C++] Implement zero-copy array slicing, integrate with IPC code paths
  • ARROW-81 - [Format] Augment dictionary encoding metadata to accommodate additional use cases
  • ARROW-96 - Add C++ API documentation
  • ARROW-97 - API documentation via sphinx-apidoc
  • ARROW-108 - [C++] Add Union implementation and IPC/JSON serialization tests
  • ARROW-189 - Build 3rd party with ExternalProject.
  • ARROW-191 - Python: Provide infrastructure for manylinux1 wheels
  • ARROW-221 - Add switch for writing Parquet 1.0 compatible logical types
  • ARROW-227 - [C++/Python] Hook arrow_io generic reader / writer interface into arrow_parquet
  • ARROW-228 - [Python] Create an Arrow-cpp-compatible interface for reading bytes from Python file-like objects
  • ARROW-240 - Installation instructions for pyarrow
  • ARROW-243 - [C++] Add option to switch between libhdfs and libhdfs3 when creating HdfsClient
  • ARROW-268 - [C++] Flesh out union implementation to have all required methods for IPC
  • ARROW-303 - [C++] Also build static libraries for leaf libraries
  • ARROW-312 - [Java] IPC file round trip tool for integration testing
  • ARROW-312 - Read and write Arrow IPC file format from Python
  • ARROW-317 - Add Slice, Copy methods to Buffer
  • ARROW-327 - [Python] Remove conda builds from Travis CI setup
  • ARROW-328 - Return shared_ptr<T> by value instead of const-ref
  • ARROW-330 - CMake functions to simplify shared / static library configuration
  • ARROW-332 - Add RecordBatch.to_pandas method
  • ARROW-333 - Make writers update their internal schema even when no data is written
  • ARROW-335 - Improve Type apis and toString() by encapsulating flatbuffers better
  • ARROW-336 - Run Apache Rat in Travis builds
  • ARROW-338 - Implement visitor pattern for IPC loading/unloading
  • ARROW-344 - Instructions for building with conda
  • ARROW-350 - Added Kerberos to HDFS client
  • ARROW-353 - Arrow release 0.2
  • ARROW-355 - Add tests for serialising arrays of empty strings to Parquet
  • ARROW-356 - Add documentation about reading Parquet
  • ARROW-359 - Document ARROW_LIBHDFS_DIR
  • ARROW-360 - C++: Add method to shrink PoolBuffer using realloc
  • ARROW-361 - Python: Support reading a column-selection from Parquet files
  • ARROW-363 - [Java/C++] integration testing harness, initial integration tests
  • ARROW-365 - Python: Provide Array.to_pandas()
  • ARROW-366 - Java Dictionary Vector
  • ARROW-367 - converter json <=> Arrow file format for Integration tests
  • ARROW-368 - Added note for LD_LIBRARY_PATH in Python README
  • ARROW-369 - [Python] Convert multiple record batches at once to Pandas
  • ARROW-370 - Python: Pandas conversion from `datetime.date` columns
  • ARROW-372 - json vector serialization format
  • ARROW-373 - [C++] JSON serialization format for testing
  • ARROW-374 - More precise handling of bytes vs unicode in Python API
  • ARROW-377 - Python: Add support for conversion of Pandas.Categorical
  • ARROW-379 - Use setuptools_scm for Python versioning
  • ARROW-380 - [Java] optimize null count when serializing vectors
  • ARROW-381 - [C++] Simplify primitive array type builders to use a default type singleton
  • ARROW-382 - Extend Python API documentation
  • ARROW-383 - [C++] Integration testing CLI tool
  • ARROW-389 - Python: Write Parquet files to pyarrow.io.NativeFile objects
  • ARROW-394 - [Integration] Generate tests cases for numeric types, strings, lists, structs
  • ARROW-396 - [Python] Add pyarrow.schema.Schema.equals
  • ARROW-409 - [Python] Change record batches conversion to Table
  • ARROW-410 - [C++] Add virtual Writeable::Flush
  • ARROW-411 - [Java] Move compactor functions in Integration to a separate Validator module
  • ARROW-415 - C++: Add Equals implementation to compare Tables
  • ARROW-416 - C++: Add Equals implementation to compare Columns
  • ARROW-417 - Add Equals implementation to compare ChunkedArrays
  • ARROW-418 - [C++] Array / Builder class code reorganization, flattening
  • ARROW-419 - [C++] Promote util/{status.h, buffer.h, memory-pool.h} to top level of arrow/ source directory
  • ARROW-423 - Define BUILD_BYPRODUCTS for CMake 3.2+
  • ARROW-425 - Add private API to get python Table from a C++ object
  • ARROW-426 - Python: Conversion from pyarrow.Array to a Python list
  • ARROW-427 - [C++] Implement dictionary array type
  • ARROW-428 - [Python] Multithreaded conversion from Arrow table to pandas.DataFrame
  • ARROW-430 - Improved version handling
  • ARROW-432 - [Python] Construct precise pandas BlockManager structure for zero-copy DataFrame initialization
  • ARROW-438 - [C++/Python] Implement zero-data-copy record batch and table concatenation.
  • ARROW-440 - [C++] Support pkg-config
  • ARROW-441 - [Python] Expose Arrow's file and memory map classes as NativeFile subclasses
  • ARROW-442 - [Python] Inspect Parquet file metadata from Python
  • ARROW-444 - [Python] Native file reads into pre-allocated memory. Some IO API cleanup / niceness
  • ARROW-449 - Python: Conversion from pyarrow.{Table,RecordBatch} to a Python dict
  • ARROW-450 - Fixes for PARQUET-818
  • ARROW-456 - Add jemalloc based MemoryPool
  • ARROW-457 - Python: Better control over memory pool
  • ARROW-458 - [Python] Expose jemalloc MemoryPool
  • ARROW-461 - [Python] Add Python interfaces to DictionaryArray data, pandas interop
  • ARROW-463 - C++: Support jemalloc 4.x
  • ARROW-466 - Add ExternalProject for jemalloc
  • ARROW-467 - [Python] Run Python parquet-cpp unit tests in Travis CI
  • ARROW-468 - Python: Conversion of nested data in pd.DataFrames
  • ARROW-470 - [Python] Add "FileSystem" abstraction to access directories of files in a uniform way
  • ARROW-471 - [Python] Enable ParquetFile to pass down separately-obtained file metadata
  • ARROW-472 - [Python] Expose more C++ IO interfaces. Add equals methods to Parquet schemas. Pass Parquet metadata separately in reader
  • ARROW-474 - [Java] Add initial version of streaming serialized format.
  • ARROW-475 - [Python] Add support for reading multiple Parquet files as a single pyarrow.Table
  • ARROW-476 - Add binary integration test fixture, add Java support
  • ARROW-477 - [Java] Add support for second/microsecond/nanosecond timestamps in-memory and in IPC/JSON layer
  • ARROW-478 - Consolidate BytesReader and BufferReader to accept PyBytes or Buffer
  • ARROW-479 - Python: Test for expected schema in Pandas conversion
  • ARROW-484 - Revise README to include more detail about software components
  • ARROW-485 - [Java] Users are required to initialize VariableLengthVectors.offsetVector before calling VariableLengthVectors.mutator.getSafe
  • ARROW-490 - Python: Update manylinux1 build scripts
  • ARROW-495 - [C++] Implement streaming binary format, refactoring
  • ARROW-497 - Integration harness for streaming file format
  • ARROW-498 - [C++] Add command line utilities that convert between stream and file.
  • ARROW-503 - [Python] Implement Python interface to streaming file format
  • ARROW-506 - Java: Implement echo server for integration testing.
  • ARROW-508 - [C++] Add basic threadsafety to normal files and memory maps
  • ARROW-509 - [Python] Add support for multithreaded Parquet reads
  • ARROW-512 - C++: Add method to check for primitive types
  • ARROW-514 - [Python] Automatically wrap pyarrow.io.Buffer in BufferReader
  • ARROW-515 - [Python] Add read_all methods to FileReader, StreamReader
  • ARROW-521 - [C++] Track peak allocations in default memory pool
  • ARROW-524 - provide apis to access nested vectors and buffers
  • ARROW-525 - Python: Add more documentation to the package
  • ARROW-527 - Remove drill-module.conf file
  • ARROW-529 - Python: Add jemalloc and Python 3.6 to manylinux1 build
  • ARROW-531 - Python: Document jemalloc, extend Pandas section, add Getting Involved
  • ARROW-538 - [C++] Set up AddressSanitizer (ASAN) builds
  • ARROW-546 - Python: Account for changes in PARQUET-867
  • ARROW-547 - [Python] Add zero-copy slice methods to Array, RecordBatch
  • ARROW-553 - C++: Faster valid bitmap building
  • ARROW-558 - Add KEYS files