Safe.pm problem when regex does an implicit 'use utf8' #9123

Closed
p5pRT opened this issue Nov 18, 2007 · 11 comments

p5pRT commented Nov 18, 2007

Migrated from rt.perl.org#47576 (status was 'resolved')

Searchable as RT47576$

p5pRT commented Nov 18, 2007

From greg@turnstep.com

Using Safe.pm uncovered an implicit 'use utf8' when a regex
/i flag is used. This is blocked because of the 'require' opcode.
There seems to be no way around this other than issuing
permit('require'), which pretty much is not a good Safe practice.
Workarounds welcome.

Sample program, using perl 5.8.8:

#!perl

use strict;
use warnings;
use utf8;
use Safe;

my $cmp = new Safe;

$cmp->reval('/ać/i');

printf $@ ? "Error was $@\n" : "Okay\n";

$cmp->reval('/ac/i');

printf $@ ? "Error was $@\n" : "Okay\n";

__DATA__

## Output is:

Error was 'require' trapped by operation mask at (eval 2) line 1.

Okay


Flags:
  category=core
  severity=medium

p5pRT commented Nov 18, 2007

From @eserte

Greg Sabino Mullane (via RT) <perlbug-followup@perl.org> writes:

> # New Ticket Created by Greg Sabino Mullane
> # Please include the string: [perl #47576]
> # in the subject line of all future correspondence about this issue.
> # <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=47576 >
>
> Using Safe.pm uncovered an implicit 'use utf8' when a regex
> /i flag is used. This is blocked because of the 'require' opcode.
> There seems to be no way around this other than issuing
> permit('require'), which pretty much is not a good Safe practice.
> Workarounds welcome.

This still fails with perl5.10.0 RC1.

For a workaround, throw something into your code which triggers
loading of some unicode-related .pl files (utf8_heavy.pl,
unicore/To/Fold.pl etc.) at compile time, e.g. with

  qr{\x{0100}}i;

Note that the unicode codepoint must be 0x100 or higher.
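
A minimal sketch of that workaround, assuming that compiling the pattern outside the compartment is enough to get the Unicode tables loaded before any restricted code runs. The helper name run_in_safe is hypothetical and only wraps the calls from the sample program; whether the first line prints "Okay" depends on the perl version in use.

```perl
use strict;
use warnings;
use utf8;
use Safe;

# Assumption: compiling this pattern outside the compartment makes perl load
# its Unicode case-folding tables (the code point is 0x100 or higher).
my $preload = qr{\x{0100}}i;

# Hypothetical helper: run $code in a fresh compartment and report the result
# in the same style as the sample program.
sub run_in_safe {
    my ($code) = @_;
    my $cmp = Safe->new;
    $cmp->reval($code);
    return $@ ? "Error was $@" : "Okay";
}

print run_in_safe('/ać/i'), "\n";   # the case that failed in the report
print run_in_safe('/ac/i'), "\n";   # the ASCII case that already worked
```

The preload has to happen before the first reval call, since the compartment itself is not allowed to run 'require'.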

Regards,
  Slaven

--
Slaven Rezic - slaven <at> rezic <dot> de
  BBBike - route planner for cyclists in Berlin
  WWW version: http://www.bbbike.de
  Perl/Tk version for Unix and Windows: http://bbbike.sourceforge.net

p5pRT commented Nov 18, 2007

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented Nov 19, 2007

From @rgs

On 18 Nov 2007 20:53:43 +0100, Slaven Rezic <slaven@rezic.de> wrote:

> This still fails with perl5.10.0 RC1.
>
> For a workaround, throw something into your code which triggers
> loading of some unicode-related .pl files (utf8_heavy.pl,
> unicore/To/Fold.pl etc.) at compile time, e.g. with
>
>   qr{\x{0100}}i;
>
> Note that the unicode codepoint must be 0x100 or higher.

Another option would be to patch Safe so that it itself requires utf8
and the rest. However we surely don't want to load all unicore files,
so some preparatory loading would be needed anyway if you use named
characters or the like. That's why I'm tempted to mark this bug as
"won't fix".

p5pRT commented Nov 19, 2007

From @eserte

"Rafael Garcia-Suarez" <rgarciasuarez@gmail.com> writes:

> Another option would be to patch Safe so that it itself requires utf8
> and the rest. However we surely don't want to load all unicore files,
> so some preparatory loading would be needed anyway if you use named
> characters or the like. That's why I'm tempted to mark this bug as
> "won't fix".

But hopefully only for 5.10.0? I think there are other possibilities
to fix the problem, e.g. by replacing all the perl unicore stuff with
compiled code (I found it scary that the regexp engine sometimes jumps
into evaling perl code, which in turn also calls the regexp
engine...), or by introducing the possibility to distinguish requires of
core (and maybe site?) modules from those of normal user modules.

Regards,
  Slaven

--
Slaven Rezic - slaven <at> rezic <dot> de

  tkrevdiff - graphical display of diffs between revisions (RCS, CVS or SVN)
  http://ptktools.sourceforge.net/#tkrevdiff

p5pRT commented Nov 20, 2007

From @nwc10

On Mon, Nov 19, 2007 at 09:59:59PM +0100, Slaven Rezic wrote:

> "Rafael Garcia-Suarez" <rgarciasuarez@gmail.com> writes:

> > Another option would be to patch Safe so that it itself requires utf8
> > and the rest. However we surely don't want to load all unicore files,
> > so some preparatory loading would be needed anyway if you use named
> > characters or the like. That's why I'm tempted to mark this bug as
> > "won't fix".

Or patch the regexp engine to switch out of Safe when it saves state to
re-enter the interpreter to load swashes?
[I don't know how feasible this is]

> But hopefully only for 5.10.0? I think there are other possibilities
> to fix the problem, e.g. by replacing all the perl unicore stuff with
> compiled code (I found it scary that the regexp engine jumps sometimes
> into evaling perl code, which in turn also called the regexp
> engine...), or introduce the possibility to distinguish requires of
> core (and maybe site?) modules and normal user modules.

This would actually please me most. I'm not sure how much processing is
really needed on the files that the regexp engine loads, and I keep wondering
whether (either) writing them out with Storable, or writing something custom
to load them would be faster and cleaner.

Nicholas Clark

p5pRT commented Nov 21, 2007

From @eserte

Nicholas Clark <nick@ccl4.org> writes:

> Or patch the regexp engine to switch out of Safe when it saves state to
> re-enter the interpreter to load swashes?
> [I don't know how feasible this is]

Could this be a security problem if somebody changes @INC?

use lib sub { warn "@_" };
qr{\x{0100}}i;
__END__
CODE(0x529aa8) utf8.pm at /tmp/bla.pl line 16.
CODE(0x529aa8) utf8_heavy.pl at /tmp/bla.pl line 16.
CODE(0x529aa8) unicore/PVA.pl at /tmp/bla.pl line 16.
CODE(0x529aa8) unicore/Exact.pl at /tmp/bla.pl line 16.
CODE(0x529aa8) unicore/Canonical.pl at /tmp/bla.pl line 16.
CODE(0x529aa8) unicore/To/Fold.pl at /tmp/bla.pl line 16.

Regards,
  Slaven

--
Slaven Rezic - slaven <at> rezic <dot> de

  tktimex - time recording tool
  http://sourceforge.net/projects/ptktools/

p5pRT commented Nov 21, 2007

From @nwc10

On Wed, Nov 21, 2007 at 09:51:19PM +0100, Slaven Rezic wrote:

> Nicholas Clark <nick@ccl4.org> writes:
>
> > Or patch the regexp engine to switch out of Safe when it saves state to
> > re-enter the interpreter to load swashes?
> > [I don't know how feasible this is]
>
> Could this be a security problem if somebody changes @INC?
>
> use lib sub { warn "@_" };
> qr{\x{0100}}i;
> __END__
> CODE(0x529aa8) utf8.pm at /tmp/bla.pl line 16.
> CODE(0x529aa8) utf8_heavy.pl at /tmp/bla.pl line 16.
> CODE(0x529aa8) unicore/PVA.pl at /tmp/bla.pl line 16.
> CODE(0x529aa8) unicore/Exact.pl at /tmp/bla.pl line 16.
> CODE(0x529aa8) unicore/Canonical.pl at /tmp/bla.pl line 16.
> CODE(0x529aa8) unicore/To/Fold.pl at /tmp/bla.pl line 16.

Yes. Good point.
Unless you swapped out to the @INC from outside the Safe compartment.
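
As a rough Perl-level illustration of that idea (the actual change would have to live in the regexp engine's C code, which is not shown here), a loader can temporarily restore a copy of @INC that was saved before any untrusted code had a chance to modify it. The names @trusted_inc and trusted_require are hypothetical, and Text/Wrap.pm is only a stand-in for the unicore files.

```perl
use strict;
use warnings;

# Snapshot of the search path, taken before untrusted code can modify it.
my @trusted_inc = @INC;

# Hypothetical helper: load $file using only the snapshot, ignoring any
# entries or hooks that were added to @INC afterwards.
sub trusted_require {
    my ($file) = @_;
    local @INC = @trusted_inc;   # restored automatically on scope exit
    return require $file;
}

# Simulate tampering: a hook that records every file requested through it.
my @seen;
unshift @INC, sub { push @seen, $_[1]; return };

trusted_require('Text/Wrap.pm');
print @seen ? "hook saw: @seen\n" : "hook was bypassed\n";
```

Because local is dynamically scoped, any nested require issued while the file is loading also sees only the saved path.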

Nicholas Clark

p5pRT commented Jan 27, 2013

From @khwilliamson

This was fixed by commit 9006651
Author: Tim Bunce <Tim.Bunce@pobox.com>
Date: Sun Feb 21 17:39:55 2010 +0100

  [perl #72942] Can't perform unicode operations in Safe compartment

  The fix is to make Safe load utf8.pm (and ensure utf8_heavy.pl is run)
  so it can always share utf8::SWASHNEW.

--
Karl Williamson

p5pRT commented Jan 27, 2013

@khwilliamson - Status changed from 'open' to 'resolved'
