0.3-5 2023-11-29
  * add format casts

0.3-4 2023-11-28
  * pass max.size argument through chunk.map() (#39)
  * minor change to work around rchk not being able to follow
    protections across functions.

0.3-3 2022-12-09
  * fix error/segfault (depending on R version) in as.output() when
    a type that doesn't support LENGTH() is passed (such as NULL).
  * CH.MAX.SIZE was ignored in chunk.apply() for parallel jobs
  * add CH.BINARY flag which can be set to TRUE if the merge step
    should be performed continually as a call to a binary CH.MERGE
    function instead of collecting all results and then calling
    CH.MERGE. Analogously, CH.INITIAL has been added which is a
    function called on the first result. If NULL then
    CH.MERGE(NULL, result) is called instead.
    Note: in previous versions regular chunk.apply() was behaving
    like CH.BINARY=FALSE, but when parallel was set then it behaved
    like CH.BINARY=TRUE. Now CH.BINARY is explicit.
  * new parallel chunk.apply() implementation
    The related arguments have been re-named to avoid clashes with
    actual function arguments. CH.MERGE now behaves the same way as
    with sequential processing for consistency.
    CH.PARALLEL - if set to 2 or higher triggers parallel processing
        of chunks
    CH.SEQUENTIAL - if FALSE then parallel processing is allowed to
        change the order of the chunks so that chunks which yield
        results faster are processed first.

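The two merge strategies described in the CH.BINARY entry for 0.3-3 can be illustrated with a minimal sketch in base R. This is not the package's implementation: merge_collected(), merge_continual() and the sample results are hypothetical names introduced only for illustration, and the list stands in for whatever FUN would return per chunk.

```r
# Stand-ins for per-chunk results; in real use these would come from FUN.
results <- list(c(1, 2), c(3, 4), c(5, 6))

# Analogue of CH.BINARY = FALSE: collect everything, then merge once.
# (hypothetical helper, not part of iotools)
merge_collected <- function(results, merge) do.call(merge, results)

# Analogue of CH.BINARY = TRUE: merge continually, two values at a time.
# `initial` plays the role of CH.INITIAL; if it is NULL the first result
# is passed through merge(NULL, result), as the entry above describes.
# (hypothetical helper, not part of iotools)
merge_continual <- function(results, merge, initial = NULL) {
  acc <- NULL
  for (i in seq_along(results)) {
    r <- results[[i]]
    if (i == 1L) {
      acc <- if (is.null(initial)) merge(NULL, r) else initial(r)
    } else {
      acc <- merge(acc, r)
    }
  }
  acc
}

all_at_once <- merge_collected(results, c)
one_by_one  <- merge_continual(results, c)

# A merge function that cannot accept NULL needs an initial function.
running_sum <- merge_continual(results, `+`, initial = identity)
```

With a merge function such as c() both strategies give the same value. The continual form only needs to hold the accumulated value and the latest result, which is presumably why a binary merge suits running aggregates such as sums.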
0.3-2 2021-07-23
  * minor changes for compatibility with write-barrier and R-devel
    (no functional difference)

0.3-1 2020-03-10
  * make sure connections are closed in examples so check doesn't
    complain
  * add PROTECT() to chunk.apply() and string singletons

0.3-0 2020-03-09
  * integers incorrectly parsed empty strings to 0 instead of NA (#27)
  * add as.output.raw() which supports both direct file descriptors
    and connections
  * Extend the handling of as.output()
    as.output() now supports three modes:
    1) con=NULL: a raw vector is created
    2) con=connection: writes output to binary connection
    3) con=iotools.stderr/stdout/fd(fd): writes directly to a file
       descriptor
    Also as.output() is now pass-through for raw vectors.
    Finally, most methods now support keys to be either a logical
    value to suppress names/row names or it can also be a character
    vector in which case its content is used as keys.

0.2-6 2018-02-05
  * add support for logical vectors in fdrbind

0.2-5 2018-01-24
  * disable non-blocking raw fd reads on Windows since select() does
    NOT work on FDs there.

0.2-4 2017-04-13
  * remove unnecessary reference to stdout
  * increase temporary buffer to (hopefully) appease gcc7
  * add stdout_writeBin C code
  * add fdrbind()

0.2-3 2016-09-16
  * fix a bug in timeout parameter of read.chunk() where subsecond
    timeouts were computed incorrectly

0.2-2 2016-04-26
  * add support for raw file descriptors and timeout in the chunk
    reader

0.2-1 2015-08-20
  * use R_GetConnection() API in R >=3.3.0
  * add chunk.map to mimic hmr locally
  * fix col.names handling in write.csv.raw() (#26)
  * clean up as_output_matrix to be 64-bit safe
  * use internal C methods for all output
  * support ragged lists (with recycling) and long vectors in
    as_output_dataframe
  * support I() to tag objects that don't want to use as.character()
  * make string coercion rules consistent
  * re-factor as.output.data.frame to use dybuf
  * support binary connection con in as.output() instead of buffering
  * add support for quoting via quote= parameter (#25)

0.1-12 2015-07-28
  * don't import parallel::mc* since it doesn't exist on Windows

0.1-11 2015-07-28
  * fix issues, mostly convert to 64-bit

0.1-10 2015-06-22
  * remove old stdio API
  * add quoting to read.csv.raw
  * support quotes in character fields (#24)

0.1-9 2015-06-22
  * fix handling of Windows line endings (#23)

0.1-8 2015-03-18
  * add support for iterators - imstrsplit/idstrsplit
    (Thanks to Mike Kane! - #19)
  * add tests and fixes to make them run on edge cases
  * fix mstrsplit when given length zero input
  * re-factor as.output() to use dynamic buffers

0.1-7 2015-02-10
  * add C implementations of as.output()

0.1-6 2015-02-08
  * support tab/comma separated files with as.output() when x is a
    data.frame or matrix
  * make loading hmr silently the default until we rename hmr and go
    to CRAN
  * fix header=TRUE bug
  * treat NAs in dstrsplit list input as a way to skip columns

0.1-5 2014-12-15
  * Removed "pipeline" parameter for chunk.apply and updated the
    documentation
  * Parallel option added to chunk.apply()
  * major re-structuring of the raw parsers (dstrsplit and mstrsplit)

------------------------------------------------------------------------
 Previous versions included code for Hadoop Map/Reduce; that code has
 now been moved to a separate package: https://github.com/s-u/hmr
------------------------------------------------------------------------

0.1-4 2014-06-09
  * support names from colspec, support list colspec
  * add experimental remote submission capability

0.1-3 2014-05-20
  * add hadoop.opt option and hadoop.conf support

0.1-2
  * fix missing PROTECT in chunk.tapply

0.1-1 2014-04-17
  * add key-awareness when splitting
  * add ctapply() - more efficient implementation of tapply() for
    contiguous keys
  * add support for Hadoop 2.x

0.1-0 2013-05-23
  * initial public release