Is there a way with Perl (or Python) and Excel to determine the font style used within multi-line text in a cell?

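My data comes from an Excel file in which some cells contain strings that include earlier versions of the data, shown as strikethrough characters. I know how to parse and manipulate Excel files with Perl and OLE, but I have only seen text formatting exposed at the cell level. Is there a way to access the formatting character by character? The goal is to find and remove all text that is formatted as strikethrough.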

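This is a VBA solution, because I don't have Python installed on my machine. Hopefully it shows how to access the formatting of individual characters.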

With the following in
Range("A1")

it gives the output:

strikethrough at character 17
strikethrough at character 18
strikethrough at character 19
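
The per-character idea can also be sketched without Excel. The snippet below is only a minimal Python illustration, not the VBA from this answer. It assumes you have already collected one boolean strikethrough flag per character (for example by reading `Characters(i, 1).Font.Strikethrough` for each position over COM). The sample `text`, the `flags` list and both function names are made up for the illustration.

```python
def strikethrough_positions(flags):
    """Return the 1-based positions of struck-through characters (VBA counts from 1)."""
    return [i for i, struck in enumerate(flags, start=1) if struck]


def remove_struck(text, flags):
    """Return text without the characters whose strikethrough flag is set."""
    if len(text) != len(flags):
        raise ValueError("need exactly one flag per character")
    return "".join(ch for ch, struck in zip(text, flags) if not struck)


# Hypothetical cell content: characters 17 to 19 are struck through.
text = "abcdefghijklmnopqrstuvwxyz"
flags = [17 <= i <= 19 for i in range(1, len(text) + 1)]

for position in strikethrough_positions(flags):
    print(f"strikethrough at character {position}")

print(remove_struck(text, flags))
```

Collecting the flags first and filtering afterwards keeps the Excel access separate from the string handling, so the cleanup step can be checked without a workbook.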

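Use Spreadsheet::ParseExcel to access both single-format cells and complex cells that contain several formats. Complex cells use rich text, and you can get at that formatting with the $cell->get_rich_text() method. Below is an example (adapted from the synopsis) that finds strikeout formatting both in a single-format cell and as part of a multi-format cell.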

parse_lazy_dog.pl

#!/usr/bin/env perl

use warnings;
use strict;

use Spreadsheet::ParseExcel;

my $file = 'lazy_dog.xls';
my $parser   = Spreadsheet::ParseExcel->new();
my $workbook = $parser->parse($file);

if ( !defined $workbook ) {
    die $parser->error(), ".\n";
}

for my $worksheet ( $workbook->worksheets() ) {

    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();

    for my $row ( $row_min .. $row_max ) {
        for my $col ( $col_min .. $col_max ) {

            my $cell = $worksheet->get_cell( $row, $col );
            next unless $cell;

            print "Row, Col          = ($row, $col)\n";
            print "Value             = ", $cell->value(),       "\n";
            print "Unformatted Value = ", $cell->unformatted(), "\n";

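            if ( my $rich = $cell->get_rich_text() ) {
                # Multiple formats inside one cell. Each element is
                # [ start position, font ]; a font applies until the next start.
                print "     STRIKEOUT ->   ";
                my $text = $cell->unformatted();
                my $pos  = 0;
                # Text before the first run is assumed to use the cell's own font
                my $struck = $cell->get_format()->{Font}->{Strikeout};
                for my $rich_elem (@$rich) {
                    my ($char_pos, $font) = @$rich_elem;
                    # Mark the stretch that ends where this run starts
                    my $mark = $struck ? "^" : " ";
                    print $mark x ( $char_pos - $pos );
                    ( $pos, $struck ) = ( $char_pos, $font->{Strikeout} );
                }
                # Mark the tail only if the last run is struck through
                print "^" x ( length($text) - $pos ) if $struck;
                print "\n";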

            } else {
                # Entire cell has same format
                my $format = $cell->get_format();
                my $is_strikeout = $format->{Font}->{Strikeout};
                if ($is_strikeout) {
                    print "     STRIKEOUT ->   ";
                    print "^"x(length($cell->unformatted()));
                    print "\n";
                }
                print "\n";
            }
        }
    }
}

Output of parse_lazy_dog.pl for lazy_dog.xls:

Row, Col          = (0, 0)
Value             = The
Unformatted Value = The

Row, Col          = (0, 1)
Value             = quick
Unformatted Value = quick

Row, Col          = (0, 2)
Value             = brown
Unformatted Value = brown

Row, Col          = (0, 3)
Value             = fox
Unformatted Value = fox

Row, Col          = (0, 4)
Value             = jumped
Unformatted Value = jumped

Row, Col          = (0, 5)
Value             = under
Unformatted Value = under
     STRIKEOUT ->   ^^^^^

Row, Col          = (0, 6)
Value             = over
Unformatted Value = over

Row, Col          = (0, 7)
Value             = the
Unformatted Value = the

Row, Col          = (0, 8)
Value             = lazy
Unformatted Value = lazy

Row, Col          = (0, 9)
Value             = dog.
Unformatted Value = dog.

Row, Col          = (1, 0)
Value             = THE QUICK BROWN FOX JUMPED UNDER OVER THE LAZY DOG.
Unformatted Value = THE QUICK BROWN FOX JUMPED UNDER OVER THE LAZY DOG.
     STRIKEOUT ->                              ^^^^^

It looks like you can also use Win32::OLE to access individual characters and ranges of characters.
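
The Spreadsheet::ParseExcel script above only marks the struck-through characters, while the question asks to remove them. As a rough sketch of that last step, here is a small Python function that works on the same kind of data get_rich_text() reports: a list of (start position, struck?) pairs, where each style is assumed to apply until the next start position. The function name, the `base_struck` parameter and the `runs` values are assumptions for the illustration; the runs are inferred from the marker line in the output, not read from lazy_dog.xls.

```python
def remove_struck_runs(text, runs, base_struck=False):
    """Return text with every struck-through run removed.

    runs: (start, struck) pairs sorted by start; a style lasts until the next start.
    base_struck: whether the characters before the first run are struck through.
    """
    kept = []
    pos, struck = 0, base_struck
    for start, run_struck in runs:
        if not struck:
            kept.append(text[pos:start])  # keep the stretch that just ended
        pos, struck = start, run_struck
    if not struck:
        kept.append(text[pos:])  # keep the tail after the last run
    return "".join(kept)


text = "THE QUICK BROWN FOX JUMPED UNDER OVER THE LAZY DOG."
runs = [(27, True), (32, False)]  # hypothetical: "UNDER" is struck through

cleaned = remove_struck_runs(text, runs)
# Removing a word leaves its neighbouring spaces, so collapse whitespace afterwards.
print(" ".join(cleaned.split()))
```

Working on runs rather than single characters matches how the parser reports formatting, and the whitespace cleanup is kept as a separate step because not every sheet will want it.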