Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Locale::Maketext doubles back-slash in parameterized wide-character string #13399

Closed
p5pRT opened this issue Nov 4, 2013 · 11 comments
Closed

Comments

@p5pRT
Copy link

p5pRT commented Nov 4, 2013

Migrated from rt.perl.org#120457 (status was 'resolved')

Searchable as RT120457$

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2013

From @ppisar

Hello,

following code misbehaves with Locale-Maketext-1.23​:

=== BEGIN ===
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
binmode STDOUT, '​:utf8';

package My​::Localize;
use base 'Locale​::Maketext';
1;

package My​::Localize​::cs_cz;
use base 'My​::Localize';
our %Lexicon = (
  '[_1]foo1\n' => '[_1]bar\n',
  '[_1]foo2\n' => '[_1]aěa\n',
  'foo2\n' => 'aěa\n',
);
1;

package My;
my $lh = My​::Localize->get_handle('cs_CZ') or die "No lexicon exists!";

print $lh->maketext('[_1]foo1\n', 'arg'), "\n";
print $lh->maketext('[_1]foo2\n', 'arg'), "\n";
print $lh->maketext('foo2\n'), "\n";
=== END ===

petr@​dhcp-0-146​:/tmp/Locale-Maketext-1.22 $ perl -Ilib /tmp/maketextbackslash
argbar\n
argaěa\n
aěa\n

petr@​dhcp-0-146​:/tmp/Locale-Maketext-1.23 $ perl -Ilib /tmp/maketextbackslash
argbar\n
argaěa\\n
aěa\n

Here you can see this the lexicon entry​:

  '[_1]foo2\n' => '[_1]aěa\n'

gets the back-slash double-quoted because of wide-character 'ě' in the
translation and an argument substition.

This is regression between Locale-Maketext 1.22 and 1.23 due to this patch​:

commit 1735f6f
Author​: Brian Carlson <brian.carlson@​cpanel.net>
Date​: Wed Nov 28 08​:54​:33 2012 -0500

  Fix misparsing of maketext strings.
 
  Case 61251​: This commit fixes a misparse of maketext strings that could
  lead to arbitrary code execution. Basically, maketext was compiling
  bracket notation into functions, but neglected to escape backslashes
  inside the content or die on fully-qualified method names when
  generating the code. This change escapes all such backslashes and dies
  when a method name with a colon or apostrophe is specified.

-- Petr

@p5pRT
Copy link
Author

p5pRT commented Nov 4, 2013

From @ppisar

On 2013-11-04, Petr Pisar <perlbug-followup@​perl.org> wrote​:

Here you can see this the lexicon entry​:

'\[\_1\]foo2\\n' => '\[\_1\]aě\\n'

gets the back-slash double-quoted because of wide-character 'ě' in the
translation and an argument substition.

And there is another case without wide character​:

  "[_1]foo\\n\n" => "[_1]bar\\n\n",

This one prints​:

argbar\\n

The back-slashed is doubled again.

-- Petr

@p5pRT
Copy link
Author

p5pRT commented Nov 5, 2013

From @ppisar

On 2013-11-04, Petr Pisar <ppisar@​redhat.com> wrote​:

On 2013-11-04, Petr Pisar <perlbug-followup@​perl.org> wrote​:

Here you can see this the lexicon entry​:

'\[\_1\]foo2\\n' => '\[\_1\]aě\\n'

gets the back-slash double-quoted because of wide-character 'ě' in the
translation and an argument substition.

And there is another case without wide character​:

"\[\_1\]foo\\\\n\\n" => "\[\_1\]bar\\\\n\\n"\,

This one prints​:

argbar\\n

The back-slashed is doubled again.

I think I located the bug. Each back-slash gets escaped newly because of
the patch and then in the branch

if($1 eq '[' or $1 eq '') {
  if($in_group) {
  else {
  if(length $c[-1]) {
  # Now actually processing the preceding literal
  }
  }
}

a decision whether the literal is safe to pass it as a eval-ed literal
or to pass it as an argument is made. And the decision changes if the
literal contains a new line or non-ASCII characters​:

if(length $c[-1]) {
  # Now actually processing the preceding literal
  $big_pile .= $c[-1];
  if($USE_LITERALS and (
  (ord('A') == 65)
  ? $c[-1] !~ m/[^\x20-\x7E]/s
  # ASCII very safe chars
  : $c[-1] !~ m/[^ !"\#\$%&'()*+,\-.\/0-9​:;<=>?\@​A-Z[\\\]^_`a-z{|}\x07]/s
  # EBCDIC very safe chars
  )) {
  # normal case -- all very safe chars
  $c[-1] =
s/'/\\'/g;
  push @​code, q{ '} . $c[-1] . "',\n";
  $c[-1] = ''; # reuse this slot
  }
  else {
  push @​code, ' $c[' . $#c . "],\n";
  push @​c, ''; # new chunk
  }
}

I guess the back-slash escaping should be moved to the the first branch
(very safe chars) or the text should be de-escaped in else branch again.

-- Petr

@p5pRT
Copy link
Author

p5pRT commented Nov 5, 2013

From @ppisar

Dne Út 05.lis.2013 00​:33​:39, ppisar napsal(a)​:

I guess the back-slash escaping should be moved to the the first
branch
(very safe chars) or the text should be de-escaped in else branch
again.

Attached patch implements the second idea.

-- Petr

@p5pRT
Copy link
Author

p5pRT commented Nov 5, 2013

From @ppisar

0001-Do-not-mangle-Locale-Maketext-translations-with-eval.patch
From 1f7b4a987d84da987c7d7091e1d998b8f354db42 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Petr=20P=C3=ADsa=C5=99?= <ppisar@redhat.com>
Date: Tue, 5 Nov 2013 10:32:16 +0100
Subject: [PATCH] Do not mangle Locale::Maketext translations with
 eval-non-safe characters
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This commit:

From 1735f6f53ca19f99c6e9e39496c486af323ba6a8 Mon Sep 17 00:00:00 2001
From: Brian Carlson <brian.carlson@cpanel.net>
Date: Wed, 28 Nov 2012 08:54:33 -0500
Subject: [PATCH] Fix misparsing of maketext strings.

Case 61251: This commit fixes a misparse of maketext strings that could
lead to arbitrary code execution.  Basically, maketext was compiling
bracket notation into functions, but neglected to escape backslashes
inside the content or die on fully-qualified method names when
generating the code.  This change escapes all such backslashes and dies
when a method name with a colon or apostrophe is specified.

started to escape all back-slashes which breaks case when lexicon
translations contain substition and literals with eval-non-safe
characters. E.g.  these translations:

"[_1]foo\\n\n" => "[_1]bar\\n\n",
'[_1]foo\n' => '[_1]aěa\n',

got doubled back-slashes on the maketext() output.

This patch de-escapes escaped backslashes if the literal is compiled
as function argument.

Fixes #120457.

Signed-off-by: Petr Písař <ppisar@redhat.com>
---
 dist/Locale-Maketext/lib/Locale/Maketext.pm |  1 +
 dist/Locale-Maketext/t/91_backslash.t       | 35 +++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)
 create mode 100644 dist/Locale-Maketext/t/91_backslash.t

diff --git a/dist/Locale-Maketext/lib/Locale/Maketext.pm b/dist/Locale-Maketext/lib/Locale/Maketext.pm
index a21d679..948361d 100644
--- a/dist/Locale-Maketext/lib/Locale/Maketext.pm
+++ b/dist/Locale-Maketext/lib/Locale/Maketext.pm
@@ -570,6 +570,7 @@ sub _compile {
                             $c[-1] = ''; # reuse this slot
                         }
                         else {
+                            $c[-1] =~ s/\\\\/\\/g;
                             push @code, ' $c[' . $#c . "],\n";
                             push @c, ''; # new chunk
                         }
diff --git a/dist/Locale-Maketext/t/91_backslash.t b/dist/Locale-Maketext/t/91_backslash.t
new file mode 100644
index 0000000..9fe9082
--- /dev/null
+++ b/dist/Locale-Maketext/t/91_backslash.t
@@ -0,0 +1,35 @@
+#!/usr/bin/perl -Tw
+
+use strict;
+use Test::More tests => 6;
+
+BEGIN {
+    use_ok( 'Locale::Maketext' );
+}
+
+use utf8;
+
+{
+    package My::Localize;
+    our @ISA = ('Locale::Maketext');
+}
+{
+    package My::Localize::cs_cz;
+    our @ISA = ('My::Localize');
+    our %Lexicon = (
+        '[_1]foo1\n' => '[_1]bar\n',
+        '[_1]foo2\n' => '[_1]běr\n',
+        'foo2\n' => 'aěa\n',
+        "[_1]foo\\n\n" => "[_1]bar\\n\n",
+    );
+    keys %Lexicon; # dodges the 'used only once' warning
+}
+
+my $lh = My::Localize->get_handle('cs_cz');
+isa_ok( $lh, 'My::Localize::cs_cz' );
+is( $lh->maketext('[_1]foo1\n', 'arg'), 'argbar\n', 'Safe parametrized' );
+is( $lh->maketext("[_1]foo\\n\n", 'arg'), "argbar\\n\n",
+    'new line parametrized' );
+is( $lh->maketext('foo2\n'), 'aěa\n', 'Unicode literal' );
+is( $lh->maketext('[_1]foo2\n', 'arg'), 'argběr\n', 'Unicode parametrized' );
+
-- 
1.8.3.1

@p5pRT
Copy link
Author

p5pRT commented Nov 8, 2013

From @jkeenan

Applied to blead in commit 35e037e.

The patch didn't apply cleanly, so I had to do a bit of work. In the course of this work, I​:

a. trimmed the commit message somewhat;
b. corrected some spelling errors;
c. re-ordered some unit tests to be visually consistent with the order of keys in %Lexicon
d. added new test file to MANIFEST;
e. bumped $VERSION

I am taking this ticket for the purpose of closing it in about 3 days unless there is serious objection.

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Nov 8, 2013

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Nov 8, 2013

@jkeenan - Status changed from 'open' to 'resolved'

@p5pRT
Copy link
Author

p5pRT commented Nov 8, 2013

@jkeenan - Status changed from 'resolved' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Nov 11, 2013

From @jkeenan

On Thu Nov 07 17​:31​:39 2013, jkeenan wrote​:

I am taking this ticket for the purpose of closing it in about 3 days
unless there is serious objection.

Closing as per schedule.

Thank you very much.
Jim Keenan

@p5pRT
Copy link
Author

p5pRT commented Nov 11, 2013

@jkeenan - Status changed from 'open' to 'resolved'

@p5pRT p5pRT closed this as completed Nov 11, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant