Class CsvReaderVisitorWithUTF8HeadersBase

Intermediate base class for CSV reader visitors that don't want to have to implement header handling by themselves.

Instances of this class are tied to a single CSV stream and cannot be reused or reset for use with other CSV streams.

Each instance of this visitor has an upper bound on the number of headers and on the length of each header. CSV streams that exceed these limits cause this class to throw exceptions, and the behavior of an instance is undefined once this happens.

Inheritance

Object

CsvReaderVisitorBase

CsvReaderVisitorWithUTF8HeadersBase

Inherited Members

CsvReaderVisitorBase.Null

CsvReaderVisitorBase.VisitNonstandardQuotedField()

Object.Equals(Object)

Object.Equals(Object, Object)

Object.GetHashCode()

Object.GetType()

Object.MemberwiseClone()

Object.ReferenceEquals(Object, Object)

Object.ToString()

Namespace: Cursively

Assembly: Cursively.dll

Syntax

public abstract class CsvReaderVisitorWithUTF8HeadersBase : CsvReaderVisitorBase

Remarks

The following input-dependent exceptions may get thrown when using this visitor, all of which inherit from CursivelyDataStreamException:

CursivelyHeadersAreNotUTF8Exception if DefaultDecoderFallback is being used and the CSV stream contains a sequence of invalid UTF-8 bytes.
CursivelyHeaderIsTooLongException if the CSV stream contains one or more headers that are longer than the configured maximum.
CursivelyTooManyHeadersException if the CSV stream contains more headers than the configured maximum.
CursivelyMissingDataFieldsException, by default, if a data record contains fewer fields than the header record.
CursivelyExtraDataFieldsException, by default, if a data record contains more fields than the header record.
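
Because all of these exceptions share one base type, a caller can handle them in a single catch block. The following is a minimal sketch, not part of Cursively: SafeProcessing, TryProcess, and the processStream delegate are hypothetical names, and the sketch assumes CursivelyDataStreamException lives in the Cursively namespace alongside this class.

```csharp
using System;
using Cursively;

// Hypothetical helper; not part of the Cursively API.
public static class SafeProcessing
{
    // 'processStream' stands in for whatever code feeds a CSV stream to a visitor.
    public static bool TryProcess(Action processStream, out string error)
    {
        try
        {
            processStream();
            error = null;
            return true;
        }
        catch (CursivelyDataStreamException ex)
        {
            // Covers every input-dependent exception listed above.
            error = ex.GetType().Name + ": " + ex.Message;
            return false;
        }
    }
}
```

Since the behavior of a visitor instance is undefined after a limit is exceeded, a caller that gets false back would discard the visitor rather than reuse it.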

Constructors

CsvReaderVisitorWithUTF8HeadersBase()

Initializes a new instance of the CsvReaderVisitorWithUTF8HeadersBase class.

Declaration

[Obsolete("Use the parameterized constructor, passing in 'false' for the flag to ignore a UTF-8 identifier on the first header field; instead, remove UTF-8 identifiers on the input itself.  See airbreather/Cursively#14.")]
protected CsvReaderVisitorWithUTF8HeadersBase()

CsvReaderVisitorWithUTF8HeadersBase(Int32, Int32, Boolean, DecoderFallback)

Initializes a new instance of the CsvReaderVisitorWithUTF8HeadersBase class.

Declaration

protected CsvReaderVisitorWithUTF8HeadersBase(int maxHeaderCount, int maxHeaderLength, bool ignoreUTF8IdentifierOnFirstHeaderField, DecoderFallback decoderFallback)

Parameters

Type	Name	Description
Int32	maxHeaderCount	The maximum number of headers to allow. Default: DefaultMaxHeaderCount.
Int32	maxHeaderLength	The maximum length, in UTF-16 code units, of any particular header. Default: DefaultMaxHeaderLength.
Boolean	ignoreUTF8IdentifierOnFirstHeaderField	A value indicating whether or not to ignore a leading UTF-8 BOM. Default: DefaultIgnoreUTF8IdentifierOnFirstHeaderField. This parameter was a mistake (see airbreather/Cursively#14) and will be removed in 2.x. Instead, always pass in false, and remove UTF-8 identifiers directly at the source instead of leaving it up to the visitor.
DecoderFallback	decoderFallback	The fallback logic used when the decoder encounters invalid UTF-8 bytes. Default: DefaultDecoderFallback.

Exceptions

Type	Condition
ArgumentNullException	Thrown when `decoderFallback` is null.
ArgumentOutOfRangeException	Thrown when `maxHeaderCount` or `maxHeaderLength` is less than 1 or greater than the maximum for that parameter (MaxMaxHeaderCount / MaxMaxHeaderLength).
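
A minimal sketch of a concrete subclass that calls this constructor, assuming only the members documented on this page. RecordCountingVisitor and the limits 50 and 200 are hypothetical choices for illustration.

```csharp
using System;
using Cursively;

// Hypothetical visitor that only counts data records.
public sealed class RecordCountingVisitor : CsvReaderVisitorWithUTF8HeadersBase
{
    public RecordCountingVisitor()
        : base(
            maxHeaderCount: 50,    // illustrative limit
            maxHeaderLength: 200,  // illustrative limit, in UTF-16 code units
            ignoreUTF8IdentifierOnFirstHeaderField: false, // per airbreather/Cursively#14
            decoderFallback: DefaultDecoderFallback)
    {
    }

    public long RecordCount { get; private set; }

    // Field contents are ignored; only record boundaries matter here.
    protected override void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk) { }

    protected override void VisitEndOfDataField(ReadOnlySpan<byte> chunk) { }

    protected override void VisitEndOfDataRecord() => RecordCount++;
}
```

Passing a null fallback, or a limit below 1, would hit the exceptions listed in the table above before the visitor is ever used.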

Fields

DefaultDecoderFallback

The value used by CsvReaderVisitorWithUTF8HeadersBase() to initialize the fallback logic when the decoder encounters invalid UTF-8 bytes (throw an exception).

Declaration

protected static readonly DecoderFallback DefaultDecoderFallback

Field Value

Type	Description
DecoderFallback
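
To illustrate what a throwing fallback means in general, here is a sketch that uses only System.Text and no Cursively types. It does not claim that DefaultDecoderFallback is the same object as DecoderFallback.ExceptionFallback; DecoderFallbackDemo and Decode are hypothetical names.

```csharp
using System;
using System.Text;

// Hypothetical demo class; uses only the .NET base class library.
public static class DecoderFallbackDemo
{
    // Decodes bytes as UTF-8, applying the given fallback to invalid sequences.
    public static string Decode(byte[] bytes, DecoderFallback fallback)
    {
        Encoding utf8 = Encoding.GetEncoding(
            "utf-8", EncoderFallback.ExceptionFallback, fallback);
        return utf8.GetString(bytes);
    }

    public static void Main()
    {
        byte[] invalid = { 0x41, 0xFF, 0x42 }; // 'A', an invalid byte, 'B'

        // A replacement fallback substitutes U+FFFD for the invalid byte.
        string replaced = Decode(invalid, DecoderFallback.ReplacementFallback);
        Console.WriteLine(replaced == "A\uFFFDB");

        try
        {
            // An exception fallback rejects the input instead.
            Decode(invalid, DecoderFallback.ExceptionFallback);
        }
        catch (DecoderFallbackException)
        {
            Console.WriteLine("invalid UTF-8 rejected");
        }
    }
}
```

With this visitor, the analogous failure surfaces as CursivelyHeadersAreNotUTF8Exception, as described in the Remarks for the class.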

DefaultIgnoreUTF8IdentifierOnFirstHeaderField

The value used by CsvReaderVisitorWithUTF8HeadersBase() to initialize the value indicating whether or not to ignore a leading UTF-8 BOM (true).

Declaration

[Obsolete("Always pass in 'false' instead, per airbreather/Cursively#14")]
protected static readonly bool DefaultIgnoreUTF8IdentifierOnFirstHeaderField

Field Value

Type	Description
Boolean

DefaultMaxHeaderCount

The value used by CsvReaderVisitorWithUTF8HeadersBase() to initialize the maximum number of headers (1,000).

Declaration

protected static readonly int DefaultMaxHeaderCount

Field Value

Type	Description
Int32

DefaultMaxHeaderLength

The value used by CsvReaderVisitorWithUTF8HeadersBase() to initialize the maximum length, in UTF-16 code units, of a single header (100).

Declaration

protected static readonly int DefaultMaxHeaderLength

Field Value

Type	Description
Int32
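
Because the limit is measured in UTF-16 code units rather than bytes or visible characters, a header's size on disk can differ from the length that counts against the limit. A small sketch using only the .NET base class library (HeaderLengthDemo is a hypothetical name):

```csharp
using System;
using System.Text;

// Hypothetical demo class; uses only the .NET base class library.
public static class HeaderLengthDemo
{
    // string.Length counts UTF-16 code units, the unit used by the limit.
    public static int Utf16Length(string header) => header.Length;

    // Size of the same header as encoded in a UTF-8 CSV stream.
    public static int Utf8Length(string header) => Encoding.UTF8.GetByteCount(header);

    public static void Main()
    {
        string ascii = "name";        // 4 code units, 4 UTF-8 bytes
        string accented = "na\u00EFve"; // 5 code units, 6 UTF-8 bytes
        string emoji = "\U0001F600";  // 2 code units (surrogate pair), 4 UTF-8 bytes

        foreach (string header in new[] { ascii, accented, emoji })
        {
            Console.WriteLine($"{Utf16Length(header)} code units, {Utf8Length(header)} bytes");
        }
    }
}
```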

MaxMaxHeaderCount

The maximum value that's legal for the maximum header count (0x7FEFFFFF).

Staying within this limit does not guarantee immunity from OutOfMemoryException, even with enough system virtual memory (that depends on your configuration). It is only the threshold beyond which mainstream frameworks are guaranteed to throw OutOfMemoryException if Cursively actually tried to go that high, so it is used as a "fail-fast" check.

Declaration

protected static readonly int MaxMaxHeaderCount

Field Value

Type	Description
Int32

MaxMaxHeaderLength

The maximum value that's legal for the maximum header length (0x7FEFFFFF).

Declaration

protected static readonly int MaxMaxHeaderLength

Field Value

Type	Description
Int32

Properties

CurrentFieldIndex

Gets the zero-based index of the field that is currently being read. The value should be the length of Headers during VisitEndOfHeaderRecord() and VisitEndOfDataRecord(), except after VisitMissingDataFields() or VisitUnexpectedDataField() has been called.

Declaration

protected int CurrentFieldIndex { get; }

Property Value

Type	Description
Int32

Headers

Gets the headers of the CSV stream.

Only valid after VisitEndOfHeaderRecord() has been called.

Declaration

protected ImmutableArray<string> Headers { get; }

Property Value

Type	Description
ImmutableArray<String>

Remarks

Once initialized, the value will remain the same for as long as this object instance stays alive.

Exceptions

Type	Condition
InvalidOperationException	Thrown when trying to access this value before VisitEndOfHeaderRecord() has been called.
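
One way Headers and CurrentFieldIndex might be used together is to locate a column by name once the header record is complete, then react only to that column's fields. The sketch below is hypothetical (ColumnCountingVisitor is not part of Cursively) and assumes that, inside the data-field callbacks, CurrentFieldIndex equals the index of the field being visited.

```csharp
using System;
using System.Collections.Immutable;
using Cursively;

// Hypothetical visitor that counts non-empty values in one named column.
public sealed class ColumnCountingVisitor : CsvReaderVisitorWithUTF8HeadersBase
{
    private readonly string _columnName;
    private int _columnIndex = -1; // -1 until found (or if the column is absent)
    private bool _sawBytes;

    public ColumnCountingVisitor(string columnName)
        : base(DefaultMaxHeaderCount, DefaultMaxHeaderLength, false, DefaultDecoderFallback)
    {
        _columnName = columnName;
    }

    public long NonEmptyCount { get; private set; }

    protected override void VisitEndOfHeaderRecord()
    {
        // Headers is safe to read from this point on.
        ImmutableArray<string> headers = Headers;
        _columnIndex = headers.IndexOf(_columnName);
    }

    protected override void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk)
    {
        if (CurrentFieldIndex == _columnIndex && !chunk.IsEmpty)
        {
            _sawBytes = true;
        }
    }

    protected override void VisitEndOfDataField(ReadOnlySpan<byte> chunk)
    {
        if (CurrentFieldIndex == _columnIndex && (_sawBytes || !chunk.IsEmpty))
        {
            NonEmptyCount++;
        }

        _sawBytes = false; // reset for the next field
    }

    protected override void VisitEndOfDataRecord() { }
}
```

Reading Headers inside the constructor instead would throw InvalidOperationException, since no header record has been visited yet.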

Methods

VisitEndOfDataField(ReadOnlySpan<Byte>)

Visits the last part of a non-header field's data.

Declaration

protected abstract void VisitEndOfDataField(ReadOnlySpan<byte> chunk)

Parameters

Type	Name	Description
ReadOnlySpan<Byte>	chunk	The data from the last part of the field.

Remarks

See documentation for VisitEndOfField(ReadOnlySpan<Byte>) for details about when and how this method will be called.

VisitEndOfDataRecord()

Notifies that all fields in the current non-header record have been visited.

Declaration

protected abstract void VisitEndOfDataRecord()

Remarks

See documentation for VisitEndOfRecord() for details about when and how this method will be called.

VisitEndOfField(ReadOnlySpan<Byte>)

Visits the last part of a field's data.

Declaration

public override sealed void VisitEndOfField(ReadOnlySpan<byte> chunk)

Parameters

Type	Name	Description
ReadOnlySpan<Byte>	chunk	The data from the last part of the field.

Overrides

CsvReaderVisitorBase.VisitEndOfField(ReadOnlySpan<Byte>)

Remarks

This method may be called at any time.

Any method except VisitNonstandardQuotedField(), including this one, may be called directly after a call to this method.

This method may be called without a preceding VisitPartialFieldContents(ReadOnlySpan<Byte>) call, if the field's entire data is contained within the given chunk.

VisitEndOfHeaderRecord()

Notifies that all headers have been read and Headers is safe to read.

The default behavior is to do nothing.

Declaration

protected virtual void VisitEndOfHeaderRecord()

VisitEndOfRecord()

Notifies that all fields in the current record have been visited.

Declaration

public override sealed void VisitEndOfRecord()

Overrides

CsvReaderVisitorBase.VisitEndOfRecord()

Remarks

This method may only be called as the very next method that gets called after a call to VisitEndOfField(ReadOnlySpan<Byte>).

Only VisitPartialFieldContents(ReadOnlySpan<Byte>) and VisitEndOfField(ReadOnlySpan<Byte>) may be called directly after a call to this method.

VisitMissingDataFields()

Notifies that the current non-header record is about to be terminated without reading all the fields that were identified in the header record.

The default behavior is to throw CursivelyMissingDataFieldsException.

Declaration

protected virtual void VisitMissingDataFields()

VisitPartialDataFieldContents(ReadOnlySpan<Byte>)

Visits part of a non-header field's data.

Declaration

protected abstract void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk)

Parameters

Type	Name	Description
ReadOnlySpan<Byte>	chunk	The data from this part of the field.

Remarks

See documentation for VisitPartialFieldContents(ReadOnlySpan<Byte>) for details about when and how this method will be called.

VisitPartialFieldContents(ReadOnlySpan<Byte>)

Visits part of a field's data.

Declaration

public override sealed void VisitPartialFieldContents(ReadOnlySpan<byte> chunk)

Parameters

Type	Name	Description
ReadOnlySpan<Byte>	chunk	The data from this part of the field.

Overrides

CsvReaderVisitorBase.VisitPartialFieldContents(ReadOnlySpan<Byte>)

Remarks

This method may be called at any time.

Only VisitPartialFieldContents(ReadOnlySpan<Byte>), VisitEndOfField(ReadOnlySpan<Byte>), and VisitNonstandardQuotedField() may be called directly after a call to this method.

There are multiple reasons why this method may be called instead of going straight to calling VisitEndOfField(ReadOnlySpan<Byte>):

Field is split across multiple read buffer chunks, or else it runs up to the very end of a read buffer chunk, but we can't prove it without the first byte of the next chunk or a ProcessEndOfStream(CsvReaderVisitorBase) call.
Quoted field contains a literal quote that was escaped in the original stream, and so we cannot yield the entire field data as-is.
Stream does not conform to RFC 4180, and such streams are not optimized to avoid this case.
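
Since a single field can arrive as several partial chunks followed by a final chunk, a subclass that needs whole field values has to accumulate them. The following is a simple, unoptimized sketch under that reading; StringCollectingVisitor is a hypothetical name. Spans cannot be stored in fields, so each chunk is copied into a buffer.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using Cursively;

// Hypothetical visitor that materializes every data record as a string array.
public sealed class StringCollectingVisitor : CsvReaderVisitorWithUTF8HeadersBase
{
    private readonly List<byte> _pendingBytes = new List<byte>();
    private readonly List<string> _currentRecord = new List<string>();

    public StringCollectingVisitor()
        : base(DefaultMaxHeaderCount, DefaultMaxHeaderLength, false, DefaultDecoderFallback)
    {
    }

    public List<string[]> Records { get; } = new List<string[]>();

    protected override void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk)
    {
        // Copy the bytes out; the span itself cannot be kept.
        _pendingBytes.AddRange(chunk.ToArray());
    }

    protected override void VisitEndOfDataField(ReadOnlySpan<byte> chunk)
    {
        // The final chunk completes the field, whether or not partial chunks preceded it.
        _pendingBytes.AddRange(chunk.ToArray());
        _currentRecord.Add(Encoding.UTF8.GetString(_pendingBytes.ToArray()));
        _pendingBytes.Clear();
    }

    protected override void VisitEndOfDataRecord()
    {
        Records.Add(_currentRecord.ToArray());
        _currentRecord.Clear();
    }
}
```

This favors clarity over speed: a production visitor would likely avoid the per-chunk array allocations and decode directly from the final chunk when no partial chunks were seen.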

VisitUnexpectedDataField()

Notifies that data for a field is about to be read on a non-header record, but all the fields that were identified in the header record have already been read.

This method is called before every single VisitPartialDataFieldContents(ReadOnlySpan<Byte>) or VisitEndOfDataField(ReadOnlySpan<Byte>) call for fields not present in the header record.

The default behavior is to throw CursivelyExtraDataFieldsException.

Declaration

protected virtual void VisitUnexpectedDataField()
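
Both VisitMissingDataFields() and VisitUnexpectedDataField() throw by default, so a subclass that wants to tolerate ragged records can override them without calling the base implementation. A minimal sketch, with LenientVisitor as a hypothetical name:

```csharp
using System;
using Cursively;

// Hypothetical visitor that counts ragged records instead of throwing on them.
public sealed class LenientVisitor : CsvReaderVisitorWithUTF8HeadersBase
{
    private bool _currentRecordIsRagged;

    public LenientVisitor()
        : base(DefaultMaxHeaderCount, DefaultMaxHeaderLength, false, DefaultDecoderFallback)
    {
    }

    public long RaggedRecordCount { get; private set; }

    // Not calling the base implementations suppresses the default exceptions.
    protected override void VisitMissingDataFields() => _currentRecordIsRagged = true;

    protected override void VisitUnexpectedDataField() => _currentRecordIsRagged = true;

    protected override void VisitPartialDataFieldContents(ReadOnlySpan<byte> chunk) { }

    protected override void VisitEndOfDataField(ReadOnlySpan<byte> chunk) { }

    protected override void VisitEndOfDataRecord()
    {
        if (_currentRecordIsRagged)
        {
            RaggedRecordCount++;
        }

        _currentRecordIsRagged = false; // reset for the next record
    }
}
```

Because VisitUnexpectedDataField() is called before every data-field callback for an extra field, a flag is enough here; a visitor that must skip the extra field's bytes would also need to check it inside the field callbacks.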