Class ParquetBackedPBEventFileStream
java.lang.Object
edu.stanford.slac.archiverappliance.plain.parquet.ParquetBackedPBEventFileStream
- All Implemented Interfaces:
ETLParquetFilesStream, Closeable, AutoCloseable, Iterable<Event>, RemotableOverRaw, ETLBulkStream, EventStream
public class ParquetBackedPBEventFileStream
extends Object
implements ETLParquetFilesStream, RemotableOverRaw
An EventStream implementation that reads data from one or more Parquet files.
This class serves two primary purposes:
- Data Retrieval: It can stream events from a list of Parquet files, applying time-based filters using Parquet's predicate pushdown for efficient querying.
- Optimized ETL: It implements ETLParquetFilesStream, allowing it to act as a logical concatenation of multiple source files. The ParquetETLInfoListProcessor uses this capability to combine smaller Parquet files (e.g., hourly) into larger ones (e.g., daily) without fully deserializing and re-serializing the data, significantly improving ETL performance.
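The two roles above can be illustrated with a minimal, self-contained sketch. It does not use the archiver appliance or Parquet libraries: ConcatenatedStreamSketch, SimpleEvent, and demoValues are hypothetical names invented for this example, each in-memory list stands in for one source file, and the time filter is applied in memory with inclusive bounds (an assumption), whereas the real class is described as relying on Parquet's predicate pushdown.

```java
import java.io.Closeable;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for illustration; not part of the archiver appliance API. */
final class ConcatenatedStreamSketch
        implements Iterable<ConcatenatedStreamSketch.SimpleEvent>, Closeable {

    /** Hypothetical minimal event: a timestamp plus a value. */
    static final class SimpleEvent {
        final Instant timestamp;
        final double value;

        SimpleEvent(Instant timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Each inner list stands in for the contents of one source file.
    private final List<List<SimpleEvent>> sources;
    private final Instant startTime;
    private final Instant endTime;
    private boolean closed;

    ConcatenatedStreamSketch(List<List<SimpleEvent>> sources, Instant startTime, Instant endTime) {
        this.sources = sources;
        this.startTime = startTime;
        this.endTime = endTime;
    }

    /** Walks the sources in order and keeps events with startTime <= t <= endTime. */
    @Override
    public Iterator<SimpleEvent> iterator() {
        List<SimpleEvent> selected = new ArrayList<>();
        for (List<SimpleEvent> source : sources) {
            for (SimpleEvent event : source) {
                boolean inRange = !event.timestamp.isBefore(startTime)
                        && !event.timestamp.isAfter(endTime);
                if (inRange) {
                    selected.add(event);
                }
            }
        }
        return selected.iterator();
    }

    /** A real stream would release file readers here; the sketch only records the call. */
    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    /** Builds two "hourly" sources and returns the values inside a window spanning both. */
    static List<Double> demoValues() {
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        List<SimpleEvent> hour0 = List.of(
                new SimpleEvent(t0, 1.0),
                new SimpleEvent(t0.plusSeconds(1800), 2.0));
        List<SimpleEvent> hour1 = List.of(
                new SimpleEvent(t0.plusSeconds(3600), 3.0),
                new SimpleEvent(t0.plusSeconds(5400), 4.0));
        List<Double> values = new ArrayList<>();
        // try-with-resources closes the stream even if iteration fails.
        try (ConcatenatedStreamSketch stream = new ConcatenatedStreamSketch(
                List.of(hour0, hour1), t0.plusSeconds(900), t0.plusSeconds(4000))) {
            for (SimpleEvent event : stream) {
                values.add(event.value);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(demoValues());
    }
}
```

Because the class is both Iterable and Closeable, callers can combine a for-each loop with try-with-resources, which is the same usage shape the interfaces listed for ParquetBackedPBEventFileStream suggest.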
Constructor Summary
Constructors
Constructor
- ParquetBackedPBEventFileStream(String pvName, Path path, ArchDBRTypes type)
- ParquetBackedPBEventFileStream(String pvName, Path path, ArchDBRTypes type, ParquetInfo fileInfo)
- ParquetBackedPBEventFileStream(String pvName, List<Path> paths, ArchDBRTypes type, Instant startTime, Instant endTime)
- ParquetBackedPBEventFileStream(String pvName, List<Path> paths, ArchDBRTypes type, Instant startTime, Instant endTime, ParquetInfo fileInfo)
-
Method Summary
Modifier and Type, Method, Description
- void close()
- getFirstEvent(BasicContext context): Get the first event in this event stream.
- getFirstFileInfo(): Get parquet first file info
- getPaths(): Get parquet file paths
- iterator()
- toString()
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Constructor Details
-
ParquetBackedPBEventFileStream
-
ParquetBackedPBEventFileStream
public ParquetBackedPBEventFileStream(String pvName, Path path, ArchDBRTypes type, ParquetInfo fileInfo) -
ParquetBackedPBEventFileStream
-
ParquetBackedPBEventFileStream
public ParquetBackedPBEventFileStream(String pvName, List<Path> paths, ArchDBRTypes type, Instant startTime, Instant endTime, ParquetInfo fileInfo)
-
-
Method Details
-
getFirstFileInfo
Description copied from interface: ETLParquetFilesStream
Get parquet first file info
- Specified by:
getFirstFileInfo in interface ETLParquetFilesStream
- Returns:
- First parquet file info
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
-
toString
-
iterator
-
getDescription
- Specified by:
getDescription in interface EventStream
- Specified by:
getDescription in interface RemotableOverRaw
-
getPaths
Description copied from interface: ETLParquetFilesStream
Get parquet file paths
- Specified by:
getPaths in interface ETLParquetFilesStream
- Returns:
- List of paths to parquet files
-
getFirstEvent
Description copied from interface: ETLBulkStream
Get the first event in this event stream. If there are no events in this stream, return null.
- Specified by:
getFirstEvent in interface ETLBulkStream
- Parameters:
context - BasicContext
- Returns:
- The first event, or null if the stream has no events
- Throws:
IOException
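Since getFirstEvent is documented to return null for an empty stream, callers need a null check before using the result. The following is a minimal sketch of one way to handle that, assuming a hypothetical FirstEventSource interface that stands in for the real stream; it omits the BasicContext parameter and is not part of the archiver appliance API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of the archiver appliance API. */
final class FirstEventSketch {

    /** Hypothetical stand-in for a stream whose first-event lookup may return null. */
    interface FirstEventSource<E> {
        E getFirstEvent() throws IOException;
    }

    /** Converts the "null means no events" convention into an Optional. */
    static <E> Optional<E> firstEvent(FirstEventSource<E> source) {
        try {
            return Optional.ofNullable(source.getFirstEvent());
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short; real code may prefer to propagate it.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FirstEventSource<String> empty = () -> null;
        FirstEventSource<String> oneEvent = () -> "first";
        System.out.println(firstEvent(empty).isPresent());
        System.out.println(firstEvent(oneEvent).isPresent());
    }
}
```

Wrapping the result in Optional is a design choice of this sketch, not something the interface prescribes; a plain null check works equally well.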
-
getFirstEvent
-