Class CsvReaderVisitorWithUTF8HeadersBase
Intermediate base class for CSV reader visitors that do not want to implement header handling themselves.
Instances of this class are tied to a single CSV stream and cannot be reused or reset for use with other CSV streams.
Each instance of this visitor has an upper bound on the number of headers and on the length of each header. CSV streams that exceed these limits cause this class to throw exceptions, and the behavior of a particular instance is undefined once this happens.
Inherited Members
Namespace: Cursively
Assembly: Cursively.dll
Syntax
public abstract class CsvReaderVisitorWithUTF8HeadersBase : CsvReaderVisitorBase
Remarks
The following input-dependent exceptions may get thrown when using this visitor, all of which inherit from CursivelyDataStreamException:
- CursivelyHeadersAreNotUTF8Exception if DefaultDecoderFallback is being used and the CSV stream contains a sequence of invalid UTF-8 bytes.
- CursivelyHeaderIsTooLongException if the CSV stream contains one or more headers that are longer than the configured maximum.
- CursivelyTooManyHeadersException if the CSV stream contains more headers than the configured maximum.
- CursivelyMissingDataFieldsException, by default, if a data record contains fewer fields than the header record.
- CursivelyExtraDataFieldsException, by default, if a data record contains more fields than the header record.
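Because all of these exceptions share a base type, a caller can reject bad input with a single catch block. The following is a minimal sketch, not a definitive pattern. HeaderLimitVisitor and CsvValidation are hypothetical names, and the sketch assumes the library's CsvTokenizer class with a ProcessNextChunk(ReadOnlySpan<Byte>, CsvReaderVisitorBase) method, which is not described on this page.

```csharp
using System;
using Cursively;

// Hypothetical visitor: ignores all data and exists only so that the base
// class validates the header record (at most 2 headers, 16 UTF-16 code units each).
sealed class HeaderLimitVisitor : CsvReaderVisitorWithUTF8HeadersBase
{
    public HeaderLimitVisitor()
        : base(2, 16, false, DefaultDecoderFallback) { }

    protected override void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk) { }
    protected override void VisitEndOfDataField(ReadOnlySpan<byte> chunk) { }
    protected override void VisitEndOfDataRecord() { }
}

static class CsvValidation
{
    // Returns true when the stream was accepted. On rejection, 'error' holds
    // the name of the concrete exception type.
    public static bool TryValidate(ReadOnlySpan<byte> csv, out string error)
    {
        var visitor = new HeaderLimitVisitor(); // one visitor per stream
        var tokenizer = new CsvTokenizer();     // assumed API, see lead-in
        try
        {
            tokenizer.ProcessNextChunk(csv, visitor);
            tokenizer.ProcessEndOfStream(visitor);
            error = string.Empty;
            return true;
        }
        catch (CursivelyDataStreamException ex)
        {
            // The visitor's state is undefined from here on, so it is discarded.
            error = ex.GetType().Name;
            return false;
        }
    }
}
```

A fresh visitor is created for every call because an instance cannot be reused across streams or after it has thrown.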
Constructors
CsvReaderVisitorWithUTF8HeadersBase()
Initializes a new instance of the CsvReaderVisitorWithUTF8HeadersBase class.
Declaration
[Obsolete("Use the parameterized constructor, passing in 'false' for the flag to ignore a UTF-8 identifier on the first header field; instead, remove UTF-8 identifiers on the input itself. See airbreather/Cursively#14.")]
protected CsvReaderVisitorWithUTF8HeadersBase()
CsvReaderVisitorWithUTF8HeadersBase(Int32, Int32, Boolean, DecoderFallback)
Initializes a new instance of the CsvReaderVisitorWithUTF8HeadersBase class.
Declaration
protected CsvReaderVisitorWithUTF8HeadersBase(int maxHeaderCount, int maxHeaderLength, bool ignoreUTF8IdentifierOnFirstHeaderField, DecoderFallback decoderFallback)
Parameters
Type | Name | Description |
---|---|---|
Int32 | maxHeaderCount | The maximum number of headers to allow. Default: DefaultMaxHeaderCount. |
Int32 | maxHeaderLength | The maximum length, in UTF-16 code units, of any particular header. Default: DefaultMaxHeaderLength. |
Boolean | ignoreUTF8IdentifierOnFirstHeaderField | A value indicating whether or not to ignore a leading UTF-8 BOM. Default: DefaultIgnoreUTF8IdentifierOnFirstHeaderField. This parameter was a mistake (see airbreather/Cursively#14) and will be removed in 2.x. Instead, always pass in false, and remove UTF-8 identifiers directly at the source instead of leaving it up to the visitor. |
DecoderFallback | decoderFallback | The fallback logic used when the decoder encounters invalid UTF-8 bytes. Default: DefaultDecoderFallback. |
Exceptions
Type | Condition |
---|---|
ArgumentNullException | Thrown when decoderFallback is null. |
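ArgumentOutOfRangeException | Thrown when maxHeaderCount or maxHeaderLength is outside its legal range, for example greater than MaxMaxHeaderCount or MaxMaxHeaderLength, respectively. |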
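One way to follow the advice about ignoreUTF8IdentifierOnFirstHeaderField is to hide that flag behind an intermediate class of your own, so that derived visitors cannot pass true by accident. This is a sketch under that assumption. StrictHeadersVisitor is a hypothetical name, and the choice to keep the default decoder fallback is illustrative.

```csharp
using System;
using Cursively;

// Hypothetical intermediate class: exposes only the two limits and always
// passes 'false' for the UTF-8 identifier flag, as this page recommends.
abstract class StrictHeadersVisitor : CsvReaderVisitorWithUTF8HeadersBase
{
    // Uses the documented default limits.
    protected StrictHeadersVisitor()
        : this(DefaultMaxHeaderCount, DefaultMaxHeaderLength) { }

    // Uses caller-supplied limits; the base constructor validates them.
    protected StrictHeadersVisitor(int maxHeaderCount, int maxHeaderLength)
        : base(maxHeaderCount, maxHeaderLength, false, DefaultDecoderFallback) { }
}
```

With this in place, any UTF-8 identifier has to be removed from the input before it reaches the tokenizer.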
Fields
DefaultDecoderFallback
The value used by CsvReaderVisitorWithUTF8HeadersBase() to initialize the fallback logic when the decoder encounters invalid UTF-8 bytes (throw an exception).
Declaration
protected static readonly DecoderFallback DefaultDecoderFallback
Field Value
Type | Description |
---|---|
DecoderFallback |
DefaultIgnoreUTF8IdentifierOnFirstHeaderField
The value used by CsvReaderVisitorWithUTF8HeadersBase() to initialize the value indicating whether or not to ignore a leading UTF-8 BOM (true).
Declaration
[Obsolete("Always pass in 'false' instead, per airbreather/Cursively#14")]
protected static readonly bool DefaultIgnoreUTF8IdentifierOnFirstHeaderField
Field Value
Type | Description |
---|---|
Boolean |
DefaultMaxHeaderCount
The value used by CsvReaderVisitorWithUTF8HeadersBase() to initialize the maximum number of headers (1,000).
Declaration
protected static readonly int DefaultMaxHeaderCount
Field Value
Type | Description |
---|---|
Int32 |
DefaultMaxHeaderLength
The value used by CsvReaderVisitorWithUTF8HeadersBase() to initialize the maximum length, in UTF-16 code units, of a single header (100).
Declaration
protected static readonly int DefaultMaxHeaderLength
Field Value
Type | Description |
---|---|
Int32 |
MaxMaxHeaderCount
The maximum value that's legal for the maximum header count (0x7FEFFFFF).
Staying within this limit does not guarantee that you will avoid OutOfMemoryException, even with enough system virtual memory (that depends on your configuration). It is only the threshold beyond which mainstream frameworks are guaranteed to throw OutOfMemoryException if Cursively actually tried to go that high, so it is used as a "fail-fast" check.
Declaration
protected static readonly int MaxMaxHeaderCount
Field Value
Type | Description |
---|---|
Int32 |
MaxMaxHeaderLength
The maximum value that's legal for the maximum header length (0x7FEFFFFF).
Staying within this limit does not guarantee that you will avoid OutOfMemoryException, even with enough system virtual memory (that depends on your configuration). It is only the threshold beyond which mainstream frameworks are guaranteed to throw OutOfMemoryException if Cursively actually tried to go that high, so it is used as a "fail-fast" check.
Declaration
protected static readonly int MaxMaxHeaderLength
Field Value
Type | Description |
---|---|
Int32 |
Properties
CurrentFieldIndex
Gets the zero-based index of the field that is currently being read. The value should be the length of Headers during VisitEndOfHeaderRecord() and VisitEndOfDataRecord(), except after VisitMissingDataFields() or VisitUnexpectedDataField() has been called.
Declaration
protected int CurrentFieldIndex { get; }
Property Value
Type | Description |
---|---|
Int32 |
Headers
Gets the headers of the CSV stream.
Only valid after VisitEndOfHeaderRecord() has been called.
Declaration
protected ImmutableArray<string> Headers { get; }
Property Value
Type | Description |
---|---|
ImmutableArray<String> |
Remarks
Once initialized, the value will remain the same for as long as this object instance stays alive.
Exceptions
Type | Condition |
---|---|
InvalidOperationException | Thrown when trying to access this value before VisitEndOfHeaderRecord() has been called. |
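Since Headers only becomes readable once VisitEndOfHeaderRecord() has been called, that override is a natural place to do one-time setup that depends on the header names. The sketch below builds a name-to-index lookup there. HeaderIndexVisitor and IndexByName are hypothetical names, and the data callbacks are left empty for brevity.

```csharp
using System;
using System.Collections.Generic;
using Cursively;

// Hypothetical visitor that records the position of every header name.
sealed class HeaderIndexVisitor : CsvReaderVisitorWithUTF8HeadersBase
{
    private readonly Dictionary<string, int> _indexByName =
        new Dictionary<string, int>(StringComparer.Ordinal);

    public HeaderIndexVisitor()
        : base(DefaultMaxHeaderCount, DefaultMaxHeaderLength, false, DefaultDecoderFallback) { }

    public IReadOnlyDictionary<string, int> IndexByName => _indexByName;

    protected override void VisitEndOfHeaderRecord()
    {
        // Headers is safe to read from this point onward.
        for (int i = 0; i < Headers.Length; i++)
        {
            _indexByName[Headers[i]] = i; // on duplicate names, the last one wins
        }
    }

    protected override void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk) { }
    protected override void VisitEndOfDataField(ReadOnlySpan<byte> chunk) { }
    protected override void VisitEndOfDataRecord() { }
}
```

Reading Headers any earlier, for example in the constructor, would hit the InvalidOperationException described above.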
Methods
VisitEndOfDataField(ReadOnlySpan<Byte>)
Visits the last part of a non-header field's data.
Declaration
protected abstract void VisitEndOfDataField(ReadOnlySpan<byte> chunk)
Parameters
Type | Name | Description |
---|---|---|
ReadOnlySpan<Byte> | chunk | The data from the last part of the field. |
Remarks
See documentation for VisitEndOfField(ReadOnlySpan<Byte>) for details about when and how this method will be called.
VisitEndOfDataRecord()
Notifies that all fields in the current non-header record have been visited.
Declaration
protected abstract void VisitEndOfDataRecord()
Remarks
See documentation for VisitEndOfRecord() for details about when and how this method will be called.
VisitEndOfField(ReadOnlySpan<Byte>)
Visits the last part of a field's data.
Declaration
public override sealed void VisitEndOfField(ReadOnlySpan<byte> chunk)
Parameters
Type | Name | Description |
---|---|---|
ReadOnlySpan<Byte> | chunk | The data from the last part of the field. |
Overrides
CsvReaderVisitorBase.VisitEndOfField(ReadOnlySpan<Byte>)
Remarks
This method may be called at any time.
Any method except VisitNonstandardQuotedField(), including this one, may be called directly after a call to this method.
This method may be called without a preceding VisitPartialFieldContents(ReadOnlySpan<Byte>) call, if the field's entire data is contained within the given chunk.
VisitEndOfHeaderRecord()
Notifies that all headers have been read and Headers is safe to read.
The default behavior is to do nothing.
Declaration
protected virtual void VisitEndOfHeaderRecord()
VisitEndOfRecord()
Notifies that all fields in the current record have been visited.
Declaration
public override sealed void VisitEndOfRecord()
Overrides
CsvReaderVisitorBase.VisitEndOfRecord()
Remarks
This method may only be called as the very next method that gets called after a call to VisitEndOfField(ReadOnlySpan<Byte>).
Only VisitPartialFieldContents(ReadOnlySpan<Byte>) and VisitEndOfField(ReadOnlySpan<Byte>) may be called directly after a call to this method.
VisitMissingDataFields()
Notifies that the current non-header record is about to be terminated without reading all the fields that were identified in the header record.
The default behavior is to throw CursivelyMissingDataFieldsException.
Declaration
protected virtual void VisitMissingDataFields()
VisitPartialDataFieldContents(ReadOnlySpan<Byte>)
Visits part of a non-header field's data.
Declaration
protected abstract void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk)
Parameters
Type | Name | Description |
---|---|---|
ReadOnlySpan<Byte> | chunk | The data from this part of the field. |
Remarks
See documentation for VisitPartialFieldContents(ReadOnlySpan<Byte>) for details about when and how this method will be called.
VisitPartialFieldContents(ReadOnlySpan<Byte>)
Visits part of a field's data.
Declaration
public override sealed void VisitPartialFieldContents(ReadOnlySpan<byte> chunk)
Parameters
Type | Name | Description |
---|---|---|
ReadOnlySpan<Byte> | chunk | The data from this part of the field. |
Overrides
CsvReaderVisitorBase.VisitPartialFieldContents(ReadOnlySpan<Byte>)
Remarks
This method may be called at any time.
Only VisitPartialFieldContents(ReadOnlySpan<Byte>), VisitEndOfField(ReadOnlySpan<Byte>), and VisitNonstandardQuotedField() may be called directly after a call to this method.
There are multiple reasons why this method may be called instead of going straight to calling VisitEndOfField(ReadOnlySpan<Byte>):
- The field is split across multiple read buffer chunks, or it runs up to the very end of a read buffer chunk and that cannot be confirmed without the first byte of the next chunk or a ProcessEndOfStream(CsvReaderVisitorBase) call.
- A quoted field contains a literal quote that was escaped in the original stream, so the entire field data cannot be yielded as-is.
- The stream does not conform to RFC 4180, and the tokenizer does not optimize such streams to avoid this case.
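In a class derived from CsvReaderVisitorWithUTF8HeadersBase, the same chunking reaches the data callbacks, so a visitor that wants whole field values has to stitch the pieces together. The following sketch shows one way to do that. StringRecordVisitor and Records are hypothetical names, and buffering into a List<byte> is chosen for clarity rather than speed.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using Cursively;

// Hypothetical visitor that materializes every data record as a string array.
sealed class StringRecordVisitor : CsvReaderVisitorWithUTF8HeadersBase
{
    private readonly List<byte> _fieldBytes = new List<byte>();
    private readonly List<string> _currentRecord = new List<string>();

    public StringRecordVisitor()
        : base(DefaultMaxHeaderCount, DefaultMaxHeaderLength, false, DefaultDecoderFallback) { }

    public List<string[]> Records { get; } = new List<string[]>();

    protected override void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk)
    {
        // A span cannot be stored in a field, so the bytes are copied.
        _fieldBytes.AddRange(chunk.ToArray());
    }

    protected override void VisitEndOfDataField(ReadOnlySpan<byte> chunk)
    {
        _fieldBytes.AddRange(chunk.ToArray());

        // Decode only once the field is complete, so a multi-byte UTF-8
        // sequence split across two chunks is never decoded in halves.
        _currentRecord.Add(Encoding.UTF8.GetString(_fieldBytes.ToArray()));
        _fieldBytes.Clear();
    }

    protected override void VisitEndOfDataRecord()
    {
        Records.Add(_currentRecord.ToArray());
        _currentRecord.Clear();
    }
}
```

A field that arrives in a single VisitEndOfDataField(ReadOnlySpan<Byte>) call takes the same path, just with an empty buffer beforehand.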
VisitUnexpectedDataField()
Notifies that data for a field is about to be read on a non-header record, but all the fields that were identified in the header record have already been read.
This method is called before every single VisitPartialDataFieldContents(ReadOnlySpan<Byte>) or VisitEndOfDataField(ReadOnlySpan<Byte>) call for fields not present in the header record.
The default behavior is to throw CursivelyExtraDataFieldsException.
Declaration
protected virtual void VisitUnexpectedDataField()
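Because VisitMissingDataFields() and VisitUnexpectedDataField() are virtual and throw only by default, a derived class can override both to tolerate ragged records. The sketch below counts the occurrences instead of throwing. LenientVisitor and its two counters are hypothetical names, and whether ragged input should be tolerated at all depends on the application.

```csharp
using System;
using Cursively;

// Hypothetical visitor that tolerates records whose field count differs
// from the header record, and keeps simple statistics about them.
sealed class LenientVisitor : CsvReaderVisitorWithUTF8HeadersBase
{
    public LenientVisitor()
        : base(DefaultMaxHeaderCount, DefaultMaxHeaderLength, false, DefaultDecoderFallback) { }

    // Number of data records that ended before all header fields were read.
    public int ShortRecords { get; private set; }

    // Number of data fields that had no corresponding header.
    public int ExtraFields { get; private set; }

    // Not calling the base implementation suppresses the default exception.
    protected override void VisitMissingDataFields() => ShortRecords++;

    protected override void VisitUnexpectedDataField() => ExtraFields++;

    protected override void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk) { }
    protected override void VisitEndOfDataField(ReadOnlySpan<byte> chunk) { }
    protected override void VisitEndOfDataRecord() { }
}
```

Once these overrides are in place, CurrentFieldIndex may no longer equal the length of Headers during VisitEndOfDataRecord(), as noted in the property's description.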