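1. Excel 2003 and Excel 2007
The two versions differ in their maximum row and column counts: Excel 2003 allows at most 65,536 rows and 256 columns, while Excel 2007 and later allow at most 1,048,576 rows and 16,384 columns.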
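These limits are all powers of two, and the last column of each format has a well-known letter name. The sketch below is only an illustration: the class name SheetLimits and the helper columnName are made up for this example and are not part of POI.

```java
/** Illustrative constants and helper; names are hypothetical, not POI API. */
final class SheetLimits {
    static final int XLS_MAX_ROWS = 1 << 16;   // 65,536
    static final int XLS_MAX_COLS = 1 << 8;    // 256
    static final int XLSX_MAX_ROWS = 1 << 20;  // 1,048,576
    static final int XLSX_MAX_COLS = 1 << 14;  // 16,384

    /** Converts a 1-based column number to its letter name (1 -> A, 27 -> AA). */
    static String columnName(int column) {
        StringBuilder sb = new StringBuilder();
        while (column > 0) {
            int rem = (column - 1) % 26;          // 0..25 maps to A..Z
            sb.insert(0, (char) ('A' + rem));
            column = (column - 1) / 26;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(columnName(XLS_MAX_COLS));   // IV
        System.out.println(columnName(XLSX_MAX_COLS));  // XFD
    }
}
```

So an .xls sheet ends at column IV and an .xlsx sheet at column XFD.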
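Excel 2003 stores workbooks in a binary format that other software cannot easily read. Excel 2007 adopted the XML-based OOXML open document standard, which stores a file as XML parts inside a ZIP container. XML is a text-based format and the ZIP container compresses its contents, so one major advantage is a much smaller file size.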
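Because an .xlsx file is an ordinary ZIP archive, its parts can be listed with the JDK alone. The sketch below is a minimal illustration: the class XlsxContainerDemo is hypothetical, and fakeXlsx() builds a tiny in-memory archive with typical part names and placeholder content rather than a real workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class XlsxContainerDemo {
    /** Returns the names of all entries in a ZIP stream, in stored order. */
    static List<String> listEntries(InputStream in) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /** Builds a small ZIP with xlsx-like part names; the content is a placeholder. */
    static byte[] fakeXlsx() {
        String[] parts = {"[Content_Types].xml", "xl/workbook.xml",
                          "xl/sharedStrings.xml", "xl/worksheets/sheet1.xml"};
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : parts) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<x/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // For a real file, pass java.nio.file.Files.newInputStream(path) instead.
        System.out.println(listEntries(new ByteArrayInputStream(fakeXlsx())));
    }
}
```

Run against a real workbook, the same method shows the worksheet parts under xl/worksheets/ and the shared strings table in xl/sharedStrings.xml, which are the parts the SAX reader later in this post works with.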
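2. Reading and writing large volumes of data

2.1 Writing large volumes of data
For writing large .xlsx files, POI 3.8 introduced the SXSSFWorkbook class, which keeps only a limited window of rows in memory and flushes older rows to disk, so that very large files can be written without holding the whole sheet in memory.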
For details, see the example on the POI website:
http://poi.apache.org/spreadsheet/how-to.html#sxssf
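The idea behind this buffered writing can be shown without POI. The sketch below is a toy model, not POI's implementation: WindowedRowWriter and its methods are hypothetical names, rows are plain strings, and the "disk" is any Writer. It only demonstrates that once the window is full, each new row pushes the oldest one out of memory.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a row-window writer; NOT POI's implementation. */
final class WindowedRowWriter {
    private final int windowSize;
    private final Writer out;
    private final Deque<String> window = new ArrayDeque<>();
    private int flushed = 0;

    WindowedRowWriter(int windowSize, Writer out) {
        if (windowSize < 1) throw new IllegalArgumentException("windowSize must be >= 1");
        this.windowSize = windowSize;
        this.out = out;
    }

    /** Adds a row; if the window overflows, the oldest row is written out and dropped. */
    void addRow(String row) {
        window.addLast(row);
        if (window.size() > windowSize) {
            writeOut(window.removeFirst());
        }
    }

    /** Writes out whatever is still held in memory. */
    void finish() {
        while (!window.isEmpty()) {
            writeOut(window.removeFirst());
        }
    }

    int rowsInMemory() { return window.size(); }
    int rowsFlushed() { return flushed; }

    private void writeOut(String row) {
        try {
            out.write(row);
            out.write('\n');
            flushed++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WindowedRowWriter w = new WindowedRowWriter(100, new StringWriter());
        for (int i = 0; i < 1000; i++) w.addRow("row-" + i);
        // prints "100 rows in memory, 900 flushed"
        System.out.println(w.rowsInMemory() + " rows in memory, " + w.rowsFlushed() + " flushed");
        w.finish();
    }
}
```

In POI itself the window size is passed to the SXSSFWorkbook constructor, and rows that have left the window can no longer be accessed, so the trade-off is the same: bounded memory in exchange for write-once, forward-only access.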
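2.2 Reading large volumes of data
POI can read Excel in two modes: the user model and the SAX event-driven model (the approach used by POI's example that converts an .xlsx document to CSV as it reads). The user model has a rich API and makes reading Excel easy, but it consumes a lot of memory; with very large sheets, large data grids, spurious empty rows, formulas and similar cases, it can easily run out of memory. The approach POI recommends for avoiding out-of-memory errors is this CSV-style parsing, that is, the SAX event-driven model. The rest of this section explains how to read large volumes of data: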
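What "event-driven" means in practice can be shown with the JDK's own SAX parser. The sketch below is an assumption-laden illustration, not POI code: SaxRowCounter is a hypothetical class, and the input is a simplified worksheet fragment without namespaces. The point is that the handler sees one element at a time and keeps only a counter, so memory use does not grow with the number of rows.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SaxRowCounter {
    /** Counts <row> elements while streaming; no in-memory tree of the sheet is built. */
    static int countRows(String sheetXml) {
        final int[] rows = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // The default factory is not namespace-aware, so qName holds the tag name.
                if ("row".equals(qName)) rows[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(sheetXml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sheet XML", e);
        }
        return rows[0];
    }

    public static void main(String[] args) {
        String xml = "<worksheet><sheetData>"
                + "<row r=\"1\"><c r=\"A1\"><v>1</v></c></row>"
                + "<row r=\"2\"><c r=\"A2\"><v>2</v></c></row>"
                + "<row r=\"3\"/>"
                + "</sheetData></worksheet>";
        System.out.println(countRows(xml)); // 3
    }
}
```

The user model, by contrast, materializes every row and cell as an object before any of them can be read, which is where the memory pressure comes from.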
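2.2.1 Dependencies required in pom.xml
The readers below need Apache POI on the classpath: the poi artifact (HSSF, for .xls) and the poi-ooxml artifact (XSSF, for .xlsx).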
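2.2.2 Parsing Excel 2007 files with POI and SAX
Approach: extend DefaultHandler, override its startElement(), characters() and endElement() callbacks, and add a process() method. process() iterates over all sheets and parses each one, which causes the parser to invoke startElement(), characters() and endElement() in turn. startElement() determines the data type of the cell (date, number, string and so on). characters() collects the raw text stored for the cell: for a shared string this is an index into the shared strings table, while for other types (inline strings, numbers, dates, booleans, errors, formulas) it is the stored content itself. endElement() combines the cell type from startElement() with the text from characters() to produce the final cell value, and prints it.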
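The division of labour between the three callbacks can be sketched with the JDK's SAX classes alone. This is a simplified illustration under stated assumptions, not the ExcelXlsxReader from this post: SheetCellHandler is a hypothetical class, the shared strings table is passed in as a plain list, the sheet XML has no namespaces, and number and date formatting via cell styles is left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class SheetCellHandler extends DefaultHandler {
    private final List<String> sharedStrings;          // stand-in for xl/sharedStrings.xml
    private final List<String> cells = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String cellType;                           // "t" attribute of <c>, may be null
    private boolean inValue;

    SheetCellHandler(List<String> sharedStrings) { this.sharedStrings = sharedStrings; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("c".equals(qName)) {                       // new cell: remember its type
            cellType = attrs.getValue("t");
            text.setLength(0);
        } else if ("v".equals(qName) || "t".equals(qName)) {
            inValue = true;                            // <v> value or inline-string <t> text
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inValue) text.append(ch, start, length);   // may be called several times per value
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("v".equals(qName) || "t".equals(qName)) {
            inValue = false;
        } else if ("c".equals(qName)) {                // cell finished: resolve raw text by type
            String raw = text.toString();
            if ("s".equals(cellType)) {
                cells.add(sharedStrings.get(Integer.parseInt(raw)));
            } else if ("b".equals(cellType)) {
                cells.add("1".equals(raw) ? "TRUE" : "FALSE");
            } else {
                cells.add(raw);                        // numbers, inline strings, etc.
            }
        }
    }

    /** Parses a sheet fragment and returns the resolved cell values in document order. */
    static List<String> parse(String sheetXml, List<String> sharedStrings) {
        SheetCellHandler handler = new SheetCellHandler(sharedStrings);
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(sheetXml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sheet XML", e);
        }
        return handler.cells;
    }

    public static void main(String[] args) {
        String xml = "<sheetData><row r=\"1\">"
                + "<c r=\"A1\" t=\"s\"><v>1</v></c>"
                + "<c r=\"B1\"><v>42.5</v></c>"
                + "<c r=\"C1\" t=\"inlineStr\"><is><t>hi</t></is></c>"
                + "<c r=\"D1\" t=\"b\"><v>1</v></c>"
                + "</row></sheetData>";
        System.out.println(parse(xml, Arrays.asList("alpha", "beta"))); // [beta, 42.5, hi, TRUE]
    }
}
```

Accumulating text in characters() and resolving it only when the cell closes matters because a SAX parser is allowed to deliver one text node in several chunks.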
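2.2.3 Handling Excel 2003 files by implementing POI's HSSFListener interface
Approach: write two methods, process() and processRecord(). processRecord() is the core method (it is the callback declared by HSSFListener); it handles sheet names and the various cell data types.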
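The shape of such a listener can be modelled without POI. The sketch below is a deliberately simplified stand-in, not the ExcelXlsReader from this post: Rec, Listener and CollectingListener are hypothetical types, and each record carries an already-decoded payload instead of raw bytes. Only the four record type ids are real; they are the BIFF8 ids that POI exposes as the sid constants of BoundSheetRecord, SSTRecord, LabelSSTRecord and NumberRecord.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RecordListenerDemo {
    // BIFF8 record type ids
    static final int BOUNDSHEET = 0x0085;  // sheet name
    static final int SST = 0x00FC;         // shared strings table
    static final int LABELSST = 0x00FD;    // string cell: index into the SST
    static final int NUMBER = 0x0203;      // numeric cell

    /** Hypothetical stand-in for a parsed record: type id plus decoded payload. */
    static final class Rec {
        final int sid;
        final Object payload;
        Rec(int sid, Object payload) { this.sid = sid; this.payload = payload; }
    }

    /** Hypothetical listener interface mirroring the one-callback shape of HSSFListener. */
    interface Listener { void processRecord(Rec record); }

    static final class CollectingListener implements Listener {
        final List<String> sheetNames = new ArrayList<>();
        final List<String> cells = new ArrayList<>();
        private List<String> sharedStrings = new ArrayList<>();

        @Override
        @SuppressWarnings("unchecked")
        public void processRecord(Rec record) {
            switch (record.sid) {
                case BOUNDSHEET: sheetNames.add((String) record.payload); break;
                case SST:        sharedStrings = (List<String>) record.payload; break;
                case LABELSST:   cells.add(sharedStrings.get((Integer) record.payload)); break;
                case NUMBER:     cells.add(String.valueOf(record.payload)); break;
                default:         break;  // every other record type is ignored
            }
        }
    }

    /** Feeds the records to the listener one by one, as an event-driven reader would. */
    static CollectingListener process(List<Rec> stream) {
        CollectingListener listener = new CollectingListener();
        for (Rec r : stream) listener.processRecord(r);
        return listener;
    }

    public static void main(String[] args) {
        CollectingListener result = process(Arrays.asList(
                new Rec(BOUNDSHEET, "Sheet1"),
                new Rec(SST, Arrays.asList("a", "b")),
                new Rec(LABELSST, 1),
                new Rec(NUMBER, 3.5)));
        System.out.println(result.sheetNames + " " + result.cells); // [Sheet1] [b, 3.5]
    }
}
```

In POI the listener is registered on an HSSFRequest and driven by HSSFEventFactory; as in this model, string cells can only be resolved because the shared strings record arrives before the cell records that reference it.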
2.2.4 Helper class ExcelReaderUtil
It calls the ExcelXlsReader and ExcelXlsxReader classes to read large volumes of data from Excel 2003 and Excel 2007 files respectively:
    package org.poi;

    import java.util.List;

    public class ExcelReaderUtil {
        // Supported file extensions
        public static final String EXCEL03_EXTENSION = ".xls";
        public static final String EXCEL07_EXTENSION = ".xlsx";

        /**
         * Formats one row as
         * filePath--sheetN::sheetName--rowN::cell1|cell2|...
         * and prints it.
         */
        public static void sendRows(String filePath, String sheetName, int sheetIndex,
                                    int curRow, List<String> cellList) {
            StringBuilder oneLineSb = new StringBuilder();
            oneLineSb.append(filePath);
            oneLineSb.append("--");
            oneLineSb.append("sheet" + sheetIndex);
            oneLineSb.append("::" + sheetName);
            oneLineSb.append("--");
            oneLineSb.append("row" + curRow);
            oneLineSb.append("::");
            for (String cell : cellList) {
                oneLineSb.append(cell.trim());
                oneLineSb.append("|");
            }
            String oneLine = oneLineSb.toString();
            // Drop the trailing separator
            if (oneLine.endsWith("|")) {
                oneLine = oneLine.substring(0, oneLine.lastIndexOf("|"));
            }
            System.out.println(oneLine);
        }

        /** Picks the reader that matches the file extension and prints the row count. */
        public static void readExcel(String fileName) throws Exception {
            int totalRows = 0;
            if (fileName.endsWith(EXCEL03_EXTENSION)) {
                // Excel 2003
                ExcelXlsReader excelXls = new ExcelXlsReader();
                totalRows = excelXls.process(fileName);
            } else if (fileName.endsWith(EXCEL07_EXTENSION)) {
                // Excel 2007 and later
                ExcelXlsxReader excelXlsxReader = new ExcelXlsxReader();
                totalRows = excelXlsxReader.process(fileName);
            } else {
                throw new Exception("Invalid file format: the extension of fileName must be xls or xlsx.");
            }
            System.out.println("Total rows sent: " + totalRows);
        }

        public static void main(String[] args) throws Exception {
            String path = "C:\\Users\\y****\\Desktop\\TestSample\\H_20171226_***_*****_0430.xlsx";
            ExcelReaderUtil.readExcel(path);
        }
    }
GitHub repository:
https://github.com/SwordfallYeung/POIExcel
References
https://www.cnblogs.com/huangjian2/p/6238237.html
https://www.cnblogs.com/yfrs/p/5689347.html
http://blog.csdn.net/lishengbo/article/details/40711769
https://www.cnblogs.com/wshsdlau/p/5643847.html
http://blog.csdn.net/lipinganq/article/details/78775195
http://blog.csdn.net/lipinganq/article/details/53389501
http://blog.csdn.net/zmx729618/article/details/72639037
http://blog.csdn.net/daiyutage/article/details/53010491