
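Introduction

The lineless_table_rec library is derived from LORE, the lineless (borderless) table structure recognition model from Alibaba's DuGuang (读光) team.

Our work here consists mainly of two parts:

  1. Converting the model to ONNX format for easier deployment
  2. Improving the post-processing code and integrating it with an OCR recognition model, so that the output is a complete table together with the content of each cell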

Converting the Model to ONNX

For details, see ConvertLOREToONNX.

Installation

  pip install lineless_table_rec
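
After installing, one way to confirm that the package is visible to the current Python environment is to query its installed version through the standard library. This is only a minimal sketch: `installed_version` is a hypothetical helper name introduced for illustration and is not part of lineless_table_rec.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent.

    Hypothetical helper for illustration; not part of lineless_table_rec.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("lineless_table_rec"))
```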
  

Usage

Viewing the Result

Recognition result:
  <html>
<body>
    <table>
        <tbody>
            <tr>
                <td rowspan="1" colspan="1">姓名</td>
                <td rowspan="1" colspan="1">年龄</td>
                <td rowspan="1" colspan="1">性别</td>
                <td rowspan="1" colspan="1">身高/m</td>
                <td rowspan="1" colspan="1">体重/kg</td>
                <td rowspan="1" colspan="1">BMI/(kg/m²)</td>
            </tr>
            <tr>
                <td rowspan="1" colspan="1">Duke</td>
                <td rowspan="1" colspan="1">34</td>
                <td rowspan="1" colspan="1">男</td>
                <td rowspan="1" colspan="1">1.74</td>
                <td rowspan="1" colspan="1">70</td>
                <td rowspan="1" colspan="1">23</td>
            </tr>
            <tr>
                <td rowspan="1" colspan="1">Ella</td>
                <td rowspan="1" colspan="1">26</td>
                <td rowspan="1" colspan="1">女</td>
                <td rowspan="1" colspan="1">1.60</td>
                <td rowspan="1" colspan="1">58</td>
                <td rowspan="1" colspan="1">23</td>
            </tr>
            <tr>
                <td rowspan="1" colspan="1">Eartha</td>
                <td rowspan="1" colspan="1"></td>
                <td rowspan="1" colspan="1">女</td>
                <td rowspan="1" colspan="1">1.34</td>
                <td rowspan="1" colspan="1">29</td>
                <td rowspan="1" colspan="1">16</td>
            </tr>
            <tr>
                <td rowspan="1" colspan="1">Thelonious</td>
                <td rowspan="1" colspan="1">6</td>
                <td rowspan="1" colspan="1">男</td>
                <td rowspan="1" colspan="1">1.07</td>
                <td rowspan="1" colspan="1">17</td>
                <td rowspan="1" colspan="1">15</td>
            </tr>
            <tr>
                <td rowspan="1" colspan="1">TARO</td>
                <td rowspan="1" colspan="1">22</td>
                <td rowspan="1" colspan="1">男</td>
                <td rowspan="1" colspan="1">1.728</td>
                <td rowspan="1" colspan="1">65</td>
                <td rowspan="1" colspan="1">21.7</td>
            </tr>
            <tr>
                <td rowspan="1" colspan="1">HANAKO</td>
                <td rowspan="1" colspan="1">22</td>
                <td rowspan="1" colspan="1">女</td>
                <td rowspan="1" colspan="1">1.60</td>
                <td rowspan="1" colspan="1">53</td>
                <td rowspan="1" colspan="1">20.7</td>
            </tr>
            <tr>
                <td rowspan="1" colspan="1">NARMAN</td>
                <td rowspan="1" colspan="1">38</td>
                <td rowspan="1" colspan="1">男</td>
                <td rowspan="1" colspan="1">1.76</td>
                <td rowspan="1" colspan="1">73</td>
                <td rowspan="1" colspan="1"></td>
            </tr>
            <tr>
                <td rowspan="1" colspan="1">NAOMI</td>
                <td rowspan="1" colspan="1">23</td>
                <td rowspan="1" colspan="1">女</td>
                <td rowspan="1" colspan="1">1.63</td>
                <td rowspan="1" colspan="1">60</td>
                <td rowspan="1" colspan="1"></td>
            </tr>
        </tbody>
    </table>
</body>

</html>
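
To work with a result like the one above programmatically, the HTML string can be turned into rows of cell text. The following is a minimal sketch that uses only the Python standard library. `TableRowParser` and `html_to_rows` are hypothetical names introduced for illustration and are not part of lineless_table_rec. The sketch also assumes every cell has `rowspan="1"` and `colspan="1"`, as in the sample output, and does not expand merged cells.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>.

    Hypothetical helper for illustration; rowspan/colspan are ignored.
    """

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_to_rows(table_html):
    """Parse an HTML table string into a list of rows of cell text."""
    parser = TableRowParser()
    parser.feed(table_html)
    parser.close()
    return parser.rows


sample = (
    "<table><tr><td>姓名</td><td>年龄</td></tr>"
    "<tr><td>Eartha</td><td></td></tr></table>"
)
print(html_to_rows(sample))  # [['姓名', '年龄'], ['Eartha', '']]
```

Empty cells, such as the missing age for Eartha in the sample output, come back as empty strings, so each row keeps the same number of columns.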
  

Last updated 12 Sep 2024, 19:30 +0800