
Product
Socket Now Supports pylock.toml Files
Socket now supports pylock.toml, enabling secure, reproducible Python builds with advanced scanning and full alignment with PEP 751's new standard.
ExcelDataReaderExtractor
Advanced tools
Open Source library specialized in processing, validating and extracting data from files in Excel format. Supports fields validations and conversion of the extracted data to a specific type.
Open Source library specialized in processing, validating and extracting data from files in Excel format. Supports field validations and conversion to a specific type.
Available on Nuget
https://www.nuget.org/packages/ExcelDataReaderExtractor/
You can do DI using the interface
IExcelDataReaderExtractor
which is implemented by
ExcelDataReaderExtractor
The library extract the sheet data into the generic form
IEnumerable<IEnumerable<Dictionary<string, object?>>>
Each item represents a sheet. Each sheet contains a sequence of Dictionary, a single Dictionary represents only one row of the sheet. The key of the Dictionary is the column name, and the value is the stored on the current field.
It provides methods to convert each Dictionary element into a specific object T type, to do this is necessary the properties that this T type has, contains JsonPropertyAttribute (or similar, if necessary) or ExcelFieldAttribute for matching with columns names that are stored as keys of the dictionary.
Newtonsoft.Json is used to convert the objects.
IEnumerable<IEnumerable<Dictionary<string, object?>>> ProcessExtractData(byte[] byteArrayContent, bool excludeSheetsWithNoneOrOneRows);
Params:
Returns An IEnumerable where each item represents a sheet. Each sheet contains a sequence of Dictionary, a single Dictionary represents only one row of the sheet. The key of the Dictionary is the column name, and the value is the stored on the current field.
IEnumerable<IEnumerable<Dictionary<string, object?>>> ProcessExtractData(byte[] byteArrayContent, IEnumerable<ExcelSheetField> fields, bool ignoreUnindicatedFields, bool excludeSheetsWithNoneOrOneRows);
Params:
Returns an IEnumerable where each item represents a sheet. Each sheet contains a sequence of Dictionary, a single Dictionary represents only one row of the sheet. The key of the Dictionary is the column name, and the value is the stored on the current field.
IEnumerable<T> ProcessExtractDataSheet<T>(byte[] byteArrayContent, IEnumerable<ExcelField> fields, bool ignoreUnindicatedFields, int sheetIndex = 0);
Params:
Returns an IEnumerable containing the rows converted into the output type.
List<T> ProcessExtractDataSheet<T>(byte[] byteArrayContent, bool ignoreUnindicatedFields, int sheetIndex = 0);
Params:
Returns an IEnumerable containing the rows converted into the output type.
You can find examples on the test project
IExcelDataReaderExtractor _excelDataReaderExtractor = new ExcelDataReaderExtractor();
public void Extract_All_Data_No_Convert_Model()
{
IEnumerable<IEnumerable<Dictionary<string, object?>>> excelData;
excelData = _excelDataReaderExtractor.ProcessExtractData(_thirdSheetHasValuesContent, excludeSheetsWithNoneOrOneRows: false);
Assert.True(excelData.Count() == 3 && excelData.Last().Count() == 1);
}
public void Extract_Data_Excluding_Sheets_With_None_One_Row()
{
IEnumerable<IEnumerable<Dictionary<string, object?>>> excelData;
excelData = _excelDataReaderExtractor.ProcessExtractData(_thirdSheetHasValuesContent, excludeSheetsWithNoneOrOneRows: true);
Assert.True(excelData.Count() == 1 && excelData.First().Count() == 1);
}
public void Extract_Data_Validate_Fields_No_Convert_Model()
{
IEnumerable<IEnumerable<Dictionary<string, object?>>> excelData;
List<ExcelSheetField> fields = new()
{
new()
{
ColumnName = "FirstColumnNumber",
Required = true,
Type = DataTypes.Integer,
SheetIndex = 0,
},
new()
{
ColumnName = "SecondColumnStringSecondSheet",
Required = true,
Type= DataTypes.String,
SheetIndex = 1
}
};
int firstColumnFirstSheetValue = 1;
string secondColumnSecondSheetValue = "fifth value";
excelData = _excelDataReaderExtractor.ProcessExtractData(_dataOnTwoSheetsContent, fields, ignoreUnindicatedFields: true, excludeSheetsWithNoneOrOneRows: false);
Assert.True(excelData.Count() == 2 &&
excelData.First().Any(firstSheet => (int)firstSheet["FirstColumnNumber"]! == firstColumnFirstSheetValue) &&
excelData.Last().Any(secondSheet => secondSheet["SecondColumnStringSecondSheet"]!.ToString() == secondColumnSecondSheetValue));
}
Output type T examples
internal class ExcelDataRow
{
[JsonProperty("FirstColumnNumber")]
public int FirstColumn { get; set; }
[JsonProperty("SecondColumnString")]
public string SecondColumn { get; set; } = null!;
}
internal class ExcelDataRowWithFieldAttribute
{
[ExcelField(columnName: "FirstColumnNumber", required: true, type: DataTypes.Integer)]
public int FirstColumn { get; set; }
[ExcelField(columnName: "SecondColumnString", required: true, type: DataTypes.String)]
public string SecondColumn { get; set; } = null!;
}
public void Extract_Data_Sheet_Fields_Convert_Model()
{
IEnumerable<ExcelDataRow> excelDataSheet;
List<ExcelField> fields = new()
{
new()
{
ColumnName = "FirstColumnNumber",
Required = true,
Type = DataTypes.Integer
},
new ()
{
ColumnName = "SecondColumnString",
Required = true,
Type = DataTypes.String
}
};
excelDataSheet = _excelDataReaderExtractor.ProcessExtractDataSheet<ExcelDataRow>(_columnsWithDataContent, fields: fields, ignoreUnindicatedFields: true);
Assert.True(excelDataSheet.Count() == 2);
}
public void Extract_Data_Second_Sheet_Fields_Convert_Model()
{
IEnumerable<ExcelDataRowSecondSheet> excelDataSheet;
List<ExcelField> fields = new()
{
new()
{
ColumnName = "FirstColumnNumberSecondSheet",
Required = true,
Type = DataTypes.Integer
},
new()
{
ColumnName = "SecondColumnStringSecondSheet",
Required = false,
Type = DataTypes.String
}
};
int firstColumnValue = 5;
string secondColumnValue = "fifth value";
excelDataSheet = _excelDataReaderExtractor.ProcessExtractDataSheet<ExcelDataRowSecondSheet>(_dataOnTwoSheetsContent, fields: fields, ignoreUnindicatedFields: false, sheetIndex: 1);
Assert.Contains(excelDataSheet, x => x.FirstColumn == firstColumnValue && x.SecondColumn == secondColumnValue);
}
public void Extract_Data_Sheet_Convert_Model_With_Fields_Attribute()
{
IEnumerable<ExcelDataRowWithFieldAttribute> excelDataSheet;
excelDataSheet = _excelDataReaderExtractor.ProcessExtractDataSheet<ExcelDataRowWithFieldAttribute>(_columnsWithDataContent, ignoreUnindicatedFields: true);
Assert.NotEmpty(excelDataSheet);
}
Inherits from ColumnException
Exception thrown when the sheet does not contains all columns names given by ExcelField, ExcelSheetField or ExcelFieldAttribute.
Exception thrown when ignoreUnindicatedFields is false, the process validate all the columns names of the sheet and the sheet has more columns names than the given by ExcelField, ExcelSheetField or ExcelFieldAttribute.
Exception thrown when a sheet as repeated columns names.
Inherits from FieldException
Exception thrown when the given field ExcelField, ExcelSheetField or ExcelFieldAttribute has no column name.
Exception thrown when given field ExcelField, ExcelSheetField or ExcelFieldAttribute has DataTypes value not existing in the enum.
Exception thrown when a field has value but its column name is missing.
Exception thrown when the field type value is different from the one given by DataTypes value.
Exception thrown when using the method to extract data and convert into a specific object type T without indicating the list of fields apart. In this case, the type T must include the ExcelFieldAttribute in all of its properties.
Exception thrown when a field is required and has no value.
Inherits from SheetException
Exception thrown when the sheet has no rows.
Exception thrown when a sheet has only one row.
Exception thrown when the sheet index provided does not exists in the file.
Inherits from FileException
Exception thrown when trying to process an unsupported file.
Exception thrown when the sheet/s of the file are empty or have only one row.
FAQs
Open Source library specialized in processing, validating and extracting data from files in Excel format. Supports fields validations and conversion of the extracted data to a specific type.
We found that exceldatareaderextractor demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 1 open source maintainer collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Product
Socket now supports pylock.toml, enabling secure, reproducible Python builds with advanced scanning and full alignment with PEP 751's new standard.
Security News
Research
Socket uncovered two npm packages that register hidden HTTP endpoints to delete all files on command.
Research
Security News
Malicious Ruby gems typosquat Fastlane plugins to steal Telegram bot tokens, messages, and files, exploiting demand after Vietnam’s Telegram ban.