Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[engineering.redhat.com #385443][CVE-2016-2381] security issue with multiple environment entries with the same name #15118

Closed
p5pRT opened this issue Jan 4, 2016 · 84 comments

Comments

@p5pRT
Copy link

p5pRT commented Jan 4, 2016

Migrated from rt.perl.org#127158 (status was 'resolved')

Searchable as RT127158$

@p5pRT
Copy link
Author

p5pRT commented Jan 4, 2016

From @stephane-chazelas

Hello,

this is a follow-up on an issue I reported last year to the
Debian security list and the sudo maintainer. As agreed with
the Debian seclist, I'm widening the discussion to maintainers
of software affected by that issue (or at least software best
placed to mitigate it)​: perl, yash, bash, dash, mksh, fish.
Debian, Suse and Redhat seclists are included also in their
quality of security point of contacts for GNU libc (non-GNU
systems are also affected though).

It is a security issue in that it offers a way for attackers to
bypass some sanitizing made to the environment. As noted by
Florian Weimer, it's already partly known
(https://www.securecoding.cert.org/confluence/display/c/ENV02-C.+Beware+of+multiple+environment+variables+with+the+same+effective+name),

(issue tested on a current Debian testing system).

The issue​:

The execve(file, argp[], envp[]) system call executes "file" and
passes two lists of null-terminated strings. The second one is
used to pass environment variables. By convention those strings
are in the format "VARNAME=VARVALUE".

There is typically no sanitizing done by the libc or the kernel
between that list being passed to execve() and the executed
command receiving it in its environ[] (except for specific env
vars blocked by glibc under AT_SECURE).

That envp[] may contain strings that don't follow that
convention, or, and that's what's the issue here, nothing
prevents one to pass a "VAR=value" *and* a "VAR=other-value".

The essence of the problem is that when that happens, executed
applications behave differently allowing env vars with the wrong
value to get through.

To test, you can compile this simple executable​:

#include <unistd.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char* argv[])
{
  int i;
  for (i = 2; strcmp(argv[i], "EOA"); i++);
  argv[i++] = 0;
  execve(argv[1], &argv[1], &argv[i]);
  perror(argv[1]);
  return 1;
}

Which is just a wrapper around execve(2).

Called as​:

$ ./execve /bin/bash -c 'echo "$a"; env' EOA a=1 a=2 a=3
3
PWD=/tmp
a=3
SHLVL=1
_=/usr/bin/env

You can see here that bash sets its "a" variable from the *last*
entry, and removes all the other ones.

Other shells, languages or libraries dealing with the
environment behave differently.

The bash, dash, mksh, fish, rc, yash shells and perl use the
*last* entry. ksh93, zsh and most other non-shells (libc's
getenv, python, ruby, tcl...) use the *first* entry.

Among shells, only zsh and yash preserve the other entries.

yash and perl have a bug in that even though they use the
*last* entry on input, upon assigning a variable, they update
the first entry (also, perl uses the last entry for %ENV but the
library it uses will typically use the first entry)

$ ./execve /usr/bin/perl -le 'print $ENV{a}; $ENV{a} = "x";system("env")' EOA a=1 a=2 a=3
3
a=x
a=2
a=3
$ ./execve /usr/bin/yash -c 'echo "$a"; a=x; /usr/bin/env' EOA a=1 a=2 a=3
3
a=x
a=2
a=3
PWD=/tmp

Other tools I tested are consistent on that aspect.

When assigning a new value to an env var (for example with
setenv()) that had multiple entries, with those tools that don't
remove duplicates, only the first entry is updated.

What that means for instance is that on those systems where sh
is based on bash, dash or pdksh (that is most free software
systems) a code (for instance for a setuid executable) that does​:

  setenv("PATH=/bin​:/sbin",1); /* sane value for PATH */
  system("uname");

Would be vulnerable, because if called with "PATH=/tmp/evil",
"PATH=/tmp/evil", that setenv() would only update the *first*
PATH variable, while the "sh" invoked by system() would use the
*last* one.

In the case of perl, that allows bypassing the taint mode (set
when perl is executed in setuids) in​:

$ ln -s /usr/bin/whoami uname
$ ./execve /usr/bin/perl -Te 'system("uname;")' EOA PATH="$PWD" PATH="$PWD"
Insecure $ENV{PATH} while running with -T switch at -e line 1.

Good, perl didn't let us call uname with an unsafe PATH, but​:

$ ./execve /usr/bin/perl -Te '$ENV{PATH}="/bin";system("uname;")' EOA PATH="$PWD" PATH="$PWD"
chazelas

Here, our "evil" (a symlink to whoami) uname was called, because
the second PATH variable was used.

For sudo, that affected variables passed along by sudo (like
LC_*) until Todd added some sanitizing​:
https://www.sudo.ws/repos/sudo/rev/d4dfb05db5d7 (preserves only
the first instance).

IMO, here is what should be done​:

- fix perl and yash so they get the first entry for a variable
- bash, dash, mksh, fish changed so that they get the first
  entry as well to be consistent with putenv/setenv that update
  the first.
- add some hardening to the libc so that under AT_SECURE,
  entries with duplicate env var names are removed (only first
  one retained).
- since most shells already remove duplicates, yash and zsh
  could probably do it as well.
- one could argue that setenv, putenv, or assigning env vars in
  other languages should also remove duplicates.

Some more information as already given for the benefit of the
new participants​:

- I've verified that while one can modify the source code of an
  openssh client to make it send several copies of an
  environment variable to the server, the server will only put
  one of them (the last one) in the environment of the remote
  shell. So the exploit cannot be carried out over ssh (at least
  openssh).

- (only informational), the only case I know of an application
  passing two env vars with the same name was bash prior to
  shellshock where when you did​:

  foo() { echo foo; }; foo=bar; export foo; export -f foo; cmd

  bash would pass both foo=bar and foo="() { echo foo; }" to
  cmd. (and if cmd was a bash script, both the foo variable and
  function would be imported).

  Since shellshock, bash now passes foo=bar and
  "BASH_FUNC_foo%%=() { echo foo; }"

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 5, 2016

From tg@mirbsd.de

On Mon, 4 Jan 2016 23​:21​:21 GMT, stephane.chazelas@​gmail.com wrote​:

[�]

Ouch! Good point.

- bash, dash, mksh, fish changed so that they get the first
entry

Sorry, that�s not going to happen. Assignment must stay
consistent within the shell language.

If you run�

  foo() { echo $x; }
  x=1 x=2 x=3 foo

� you expect it to show 3, not 1, too � independend of
whether x is imported, exported or neither.

I think the divergence is here that pdksh-based shells
parse the entire environ into shell variables and then
don�t use it any more so later lookup is a simple hash
table lookup and fast, whereas libc getenv(3) stops at
the first match to be fast.

I would suggest letting the kernel deduplicate environ
during exec� otherwise, making libc getenv(3) not stop
at the first match is something libc implementors will
be unhappy about but possibly have to do (does environ
have a size limit?).

- one could argue that setenv, putenv, or assigning env vars in
other languages should also remove duplicates.

This is something that, indeed, should be done, yes.

- (only informational), the only case I know of an application
[�]
bash would pass both foo=bar and foo="() { echo foo; }" to

Ah, okay.

Anyway, thanks for uncovering this� �what fun� �

bye,
//mirabilos
--
�It is inappropriate to require that a time represented as
seconds since the Epoch precisely represent the number of
seconds between the referenced time and the Epoch.�
  -- IEEE Std 1003.1b-1993 (POSIX) Section B.2.2.2

@p5pRT
Copy link
Author

p5pRT commented Jan 5, 2016

From tg@mirbsd.de

Dixi quod�

be unhappy about but possibly have to do (does environ
have a size limit?).

Yes, ARG_MAX protects us from this being a DoS vector.

during exec� otherwise, making libc getenv(3) not stop
at the first match is something libc implementors will

Fun fact​: making getenv/putenv read and set the last
item breaks Perl, because Perl�s env import loop assigns
the read environment variables to %ENV while the latter
being magical already, so this�

~/execve /usr/bin/perl -le 'print $ENV{a}; $ENV{a} = "x"; print $ENV{a}; system("env")' EOA a=1 a=2 a=3 a=4 a=5

� always prints the second-to-last value. Similarily,
from reading the code, it will react allergically to
shrinking environ during that loop (due to duplicate
removal).

- one could argue that setenv, putenv, or assigning env vars in
other languages should also remove duplicates.

This is something that, indeed, should be done, yes.

This cannot be done, thus, without breaking existing
binaries.

Then I see only one way out of this misery​: make the
C startup (crt) remove the duplicates. Or the kernel,
since a csu fix may not catch all cases (asm binaries)
and would need to be replicated e.g. in ld.so as well.
(Using MirBSD terms here.)

In the meantime, I studied what POSIX has to say; it
doesn�t touch on duplicate keys but suggests implementations
to use a hashtable for getenv() for speed anyway (unless
the application uses putenv()).

(Note​: I looked at Perl 5.8.8; later versions may or
may not differ.)

Although, one thing that *could* be done is to set
the _first_ entry and garbage-collect any later ones
in setenv/putenv, but if applications break like this
I fear that may trigger bugs in other applications and
believe normalisation in kernel/csu to be the best fix
despite the startup cost for all applications.

bye,
//mirabilos
--
(gnutls can also be used, but if you are compiling lynx for your own use,
there is no reason to consider using that package)
  -- Thomas E. Dickey on the Lynx mailing list, about OpenSSL

@p5pRT
Copy link
Author

p5pRT commented Jan 5, 2016

From tg@mirbsd.de

Dixi quod�

Although, one thing that *could* be done is to set
the _first_ entry and garbage-collect any later ones
in setenv/putenv

No, this won�t work either; Perl with -DPERL_USE_SAFE_PUTENV
would use the first value then.

So, short of normalising the environment before any
user code is run, there�s no way to plug this problem
without causing further breakage, unfortunately.

(That being said, I found a couple of bugs in MirBSD
libc�s env code, now I looked at it closely. Thanks.)

bye,
//mirabilos
--
I believe no one can invent an algorithm. One just happens to hit upon it
when God enlightens him. Or only God invents algorithms, we merely copy them.
If you don't believe in God, just consider God as Nature if you won't deny
existence. -- Coywolf Qi Hunt

@p5pRT
Copy link
Author

p5pRT commented Jan 5, 2016

From @stephane-chazelas

2016-01-05 15​:57​:57 +0000, Thorsten Glaser​:
[...]

- bash, dash, mksh, fish changed so that they get the first
entry

Sorry, that�s not going to happen. Assignment must stay
consistent within the shell language.

If you run�

foo\(\) \{ echo $x; \}
x=1 x=2 x=3 foo

� you expect it to show 3, not 1, too � independend of
whether x is imported, exported or neither.

Hi Thorsten,

That's something different.

Of course in the shell command line​:

x=1 x=2 x=3 foo

we've got 3 variable assignments for the same variable, so one
overwrites the previous one. In the case of an external foo
command, we end up to​:

execve("foo", ["foo"], ["x=3"])

not

execve("foo", ["foo"], ["x=1", "x=2", "x=3"])

(note that the Bourne shell (and IIRC earlier versions of ash)
parsed assignments from right to left, so you'd end up with
"x=1" with those shells)

What I'm talking of here is the processing of the environment
the shell receives on startup. There, having more than one entry
for an environment variable is a pathological case, variables
are only meant to have *one* value. See how most shells
(including pdksh/mksh do remove the duplicates). Shells that
support exporting arrays (like rc, es, fish) do *not* do it by
passing several a=1 a=2 (I suppose they can't rely on the order
being preserved).

zsh and ksh93 already do the right thing.

The problem I'm talking of here (at least the particular attack
vector I'm mentioning) is only because some shells (and perl)
use the environment differently from the rest (from
getenv/setenv to start with).

If the problem is fixed (shells assign $foo from the *first*
"foo=whatever" it receives and ignore (like zsh) or discard
(like ksh93) further entries for that same variable (and
everything else does the same), the problem goes away.

Now that you mention hash tables, I wonder if putenv/setenv
getting and updating the first entry and do not reorder the
entries is the norm or not.

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From @tonycoz

On Mon Jan 04 15​:22​:04 2016, stephane.chazelas@​gmail.com wrote​:

- fix perl and yash so they get the first entry for a variable

The first of the attached patches does that by reversing the order
we initialize %ENV.

- one could argue that setenv, putenv, or assigning env vars in
other languages should also remove duplicates.

The second patch attempts to remove duplicates of an environment variable
when it's set.

Tony

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From @tonycoz

0001-perl-127158-reverse-initialization-of-ENV.patch
From 9db49ead97699afd2030ae0c121a9ae7eb4a2a2f Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 6 Jan 2016 10:30:08 +1100
Subject: [perl #127158] reverse initialization of %ENV

On most platforms, if there are duplicate environment variables,
getenv() and setenv() get/modify the first entry found, make %ENV
consistent with that.
---
 perl.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/perl.c b/perl.c
index b8d98ff..d33def9 100644
--- a/perl.c
+++ b/perl.c
@@ -4319,8 +4319,11 @@ S_init_postdump_symbols(pTHX_ int argc, char **argv, char **env)
 	if (env) {
 	  char *s, *old_var;
 	  SV *sv;
-	  for (; *env; env++) {
-	    old_var = *env;
+          char **env_start = env;
+          while (*env)
+              ++env;
+	  while (env > env_start) {
+	    old_var = *--env;
 
 	    if (!(s = strchr(old_var,'=')) || s == old_var)
 		continue;
@@ -4330,10 +4333,10 @@ S_init_postdump_symbols(pTHX_ int argc, char **argv, char **env)
 	    (void)strupr(old_var);
 	    *s = '=';
 #endif
-	    sv = newSVpv(s+1, 0);
-	    (void)hv_store(hv, old_var, s - old_var, sv, 0);
-	    if (env_is_not_environ)
-	        mg_set(sv);
+            sv = newSVpv(s+1, 0);
+            (void)hv_store(hv, old_var, s - old_var, sv, 0);
+            if (env_is_not_environ)
+                mg_set(sv);
 	  }
       }
 #endif /* USE_ENVIRON_ARRAY */
-- 
2.1.4

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From @tonycoz

0002-perl-127158-remove-duplicates-in-my_setenv.patch
From 57bed4c34abe49405a03403cf3d4d26699ec0ad0 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 6 Jan 2016 11:43:18 +1100
Subject: [perl #127158] remove duplicates in my_setenv()

It's possible for environ to contain duplicate definitions of an
environment variable, to ensure a child process sees only the
definition we remove any duplicates when a given variable is set.

The implementations of unsetenv() I've tested removed all of the
definitions, but POSIX is silent on how duplicate names are handled,
so explicitly unsetenv() the name until we're sure it's gone.

For the PERL_USE_SAFE_PUTENV case without unsetenv there's not really
anything we can do.

This needs tests, assuming I can find a way to to them.
---
 util.c | 74 ++++++++++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 47 insertions(+), 27 deletions(-)

diff --git a/util.c b/util.c
index 17b62dd..8f4514b 100644
--- a/util.c
+++ b/util.c
@@ -2148,30 +2148,47 @@ Perl_my_setenv(pTHX_ const char *nam, const char *val)
             tmpenv[max] = NULL;
             environ = tmpenv;               /* tell exec where it is now */
         }
-        if (!val) {
-            safesysfree(environ[i]);
-            while (environ[i]) {
-                environ[i] = environ[i+1];
-                i++;
+        if (val) {
+            if (!environ[i]) {                 /* does not exist yet */
+                environ = (char**)safesysrealloc(environ, (i+2) * sizeof(char*));
+                environ[i+1] = NULL;    /* make sure it's null terminated */
+            }
+            else
+                safesysfree(environ[i]);
+            nlen = strlen(nam);
+            vlen = strlen(val);
+
+            environ[i] = (char*)safesysmalloc((nlen+vlen+2) * sizeof(char));
+            /* all that work just for this */
+            my_setenv_format(environ[i], nam, nlen, val, vlen);
+
+            /* look for a duplicate entry, if any */
+            for (i++; environ[i]; i++) {
+                if (strnEQ(environ[i],nam,len) && environ[i][len] == '=')
+                    break;
             }
-#ifdef __amigaos4__
-            goto my_setenv_out;
-#else
-            return;
-#endif
-        }
-        if (!environ[i]) {                 /* does not exist yet */
-            environ = (char**)safesysrealloc(environ, (i+2) * sizeof(char*));
-            environ[i+1] = NULL;    /* make sure it's null terminated */
         }
-        else
+
+        if (environ[i]) { /* environ[i] is a match */
+            I32 from = i+1;
+
+            assert(strnEQ(environ[i],nam,len) && environ[i][len] == '=');
+
             safesysfree(environ[i]);
-        nlen = strlen(nam);
-        vlen = strlen(val);
 
-        environ[i] = (char*)safesysmalloc((nlen+vlen+2) * sizeof(char));
-        /* all that work just for this */
-        my_setenv_format(environ[i], nam, nlen, val, vlen);
+            /* remove any duplicate definitions */
+            while (environ[from]) {
+                if (strnEQ(environ[from],nam,len) && environ[from][len] == '=') {
+                    safesysfree(environ[from]);
+                    ++from;
+                }
+                else {
+                    environ[i++] = environ[from++];
+                }
+            }
+            /* copy the NULL */
+            environ[i] = environ[from];
+        }
     } else {
 # endif
     /* This next branch should only be called #if defined(HAS_SETENV), but
@@ -2180,9 +2197,10 @@ Perl_my_setenv(pTHX_ const char *nam, const char *val)
     */
 #   if defined(__CYGWIN__)|| defined(__SYMBIAN32__) || defined(__riscos__) || (defined(__sun) && defined(HAS_UNSETENV))
 #       if defined(HAS_UNSETENV)
-        if (val == NULL) {
-            (void)unsetenv(nam);
-        } else {
+        /* ensure any duplicates are removed */
+        while (PerlEnv_getenv(nam))
+            unsetenv(nam);
+        if (val) {
             (void)setenv(nam, val, 1);
         }
 #       else /* ! HAS_UNSETENV */
@@ -2190,10 +2208,12 @@ Perl_my_setenv(pTHX_ const char *nam, const char *val)
 #       endif /* HAS_UNSETENV */
 #   else
 #       if defined(HAS_UNSETENV)
-        if (val == NULL) {
-            if (environ) /* old glibc can crash with null environ */
-                (void)unsetenv(nam);
-        } else {
+        /* ensure any duplicates are removed */
+        if (environ) {  /* old glibc can crash with null environ */
+            while (PerlEnv_getenv(nam))
+                unsetenv(nam);
+        }
+        if (val) {
 	    const int nlen = strlen(nam);
 	    const int vlen = strlen(val);
 	    char * const new_env =
-- 
2.1.4

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From krahmer@suse.com

On Tue, Jan 05, 2016 at 10​:03​:47PM +0000, Stephane CHAZELAS wrote​:

[...]

The problem I'm talking of here (at least the particular attack
vector I'm mentioning) is only because some shells (and perl)
use the environment differently from the rest (from
getenv/setenv to start with).

If the problem is fixed (shells assign $foo from the *first*
"foo=whatever" it receives and ignore (like zsh) or discard
(like ksh93) further entries for that same variable (and
everything else does the same), the problem goes away.

Now that you mention hash tables, I wonder if putenv/setenv
getting and updating the first entry and do not reorder the
entries is the norm or not.

Worse. putenv/setenv reallocate and move stuff across the heap,
resetting __environ and may leave envp[] argument of main()
out of sync. (think threads)

I think the only thing we can do is sticking to what
POSIX says about getenv/putenv etc. and making perl etc.
aware that dups may happen and remove them on init (not ignoring them! that
already caused root exploits via openbsd rtld double LD_PRELOAD).

glibc properly handles insecure dups on AT_SECURE. Maybe POSIX allows
to erase dups for any startup by ld.so, which would be prefered.

-s

--

~ perl self.pl
~ $_='print"\$_=\47$_\47;eval"';eval
~ krahmer@​suse.com - SuSE Security Team

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From @stephane-chazelas

2016-01-06 11​:00​:42 +0100, Sebastian Krahmer​:
[...]

Worse. putenv/setenv reallocate and move stuff across the heap,
resetting __environ and may leave envp[] argument of main()
out of sync. (think threads)
[...]

But can any of the env entries that a program *received* (as
opposed to the ones added later with setenv/putenv) be reordered
with glibc (or other libcs)? I've not managed to.

Can setenv("x", "value") update anything else than the *first*
entry? Or that upon the next execve(), that "x=value" may not be
the first one?

I think the only thing we can do is sticking to what
POSIX says about getenv/putenv etc. and making perl etc.
aware that dups may happen and remove them on init (not ignoring them! that
already caused root exploits via openbsd rtld double LD_PRELOAD).
[...]

Well, the problem is that in​:

setenv("PATH=sane-value");
execlp("perl-script");
or system("cmd");

setenv updates only one of the PATH entries (in my experience
the first), but perl or sh use a different one. Even if perl is
modified to use only one and ignore the rest (and most sh do
that), it has to be the same one as set by setenv().

Here, if we were to cast blame, that would partly on setenv()
that fails to update *all* the PATH entries (or set one PATH env
var and remove the other ones), or on bash/mksh/dash/fish that
get a different PATH from the one set by setenv() (or the
kernel's execve() for not removing duplicates).

glibc properly handles insecure dups on AT_SECURE.
[...]

What do you mean?

They don't seem to be removed here (Linux Mint 17.3) for a
setuid executable​:

$ ll env
-rwsr-x--- 1 root chazelas 27232 Nov 27 2014 env*
$ ./execve ./env EOA PATH=/tmp/evil PATH=/tmp/evil
PATH=/tmp/evil
PATH=/tmp/evil
$ ./execve ./env PATH=/usr/bin env EOA PATH=/tmp/evil PATH=/tmp/evil
PATH=/usr/bin
PATH=/tmp/evil
$ /lib/x86_64-linux-gnu/libc-2.19.so
GNU C Library (Ubuntu EGLIBC 2.19-0ubuntu6.6) stable release version 2.19, by Roland McGrath et al.
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.2.
Compiled on a Linux 3.13.11 system on 2015-02-25.
Available extensions​:
  crypt add-on version 2.1 by Michael Glad and others
  GNU Libidn by Simon Josefsson
  Native POSIX Threads Library by Ulrich Drepper et al
  BIND-8.2.3-T5B
libc ABIs​: UNIQUE IFUNC
For bug reporting instructions, please see​:
<https://bugs.launchpad.net/ubuntu/+source/eglibc/+bugs>.

(according to ltrace, that "env" uses putenv(3)).

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From krahmer@suse.com

On Wed, Jan 06, 2016 at 01​:08​:19PM +0000, Stephane CHAZELAS wrote​:

2016-01-06 11​:00​:42 +0100, Sebastian Krahmer​:
[...]

Worse. putenv/setenv reallocate and move stuff across the heap,
resetting __environ and may leave envp[] argument of main()
out of sync. (think threads)
[...]

But can any of the env entries that a program *received* (as
opposed to the ones added later with setenv/putenv) be reordered
with glibc (or other libcs)? I've not managed to.

Can setenv("x", "value") update anything else than the *first*
entry? Or that upon the next execve(), that "x=value" may not be
the first one?

Thats implementation dependend of corse, so we have to assume
the bad case that it may be re-ordered. If not now, maybe in future
implementations.
Thats why I voted for removing (erasing) dups rather than ignoring them (and also because of
the de-sync envp[] thing).

I think the only thing we can do is sticking to what
POSIX says about getenv/putenv etc. and making perl etc.
aware that dups may happen and remove them on init (not ignoring them! that
already caused root exploits via openbsd rtld double LD_PRELOAD).
[...]

Well, the problem is that in​:

setenv("PATH=sane-value");
execlp("perl-script");
or system("cmd");

setenv updates only one of the PATH entries (in my experience
the first), but perl or sh use a different one. Even if perl is
modified to use only one and ignore the rest (and most sh do
that), it has to be the same one as set by setenv().

Yes, see above about _removing_ dups. Also note that this example
is insecure either way since best practices is to either clearenv() and only
then using setenv() or to pass the array itself to execve(). For exactly
the reason of junk/dups that may be inherited. Using a polluted environment
and then calling some setenv's never was secure. White lists vs. black lists
approach.

Here, if we were to cast blame, that would partly on setenv()
that fails to update *all* the PATH entries (or set one PATH env
var and remove the other ones), or on bash/mksh/dash/fish that
get a different PATH from the one set by setenv() (or the
kernel's execve() for not removing duplicates).

glibc properly handles insecure dups on AT_SECURE.
[...]

What do you mean?

They don't seem to be removed here (Linux Mint 17.3) for a
setuid executable​:

Removing any and all occurences of insecure env vars, including
any dups.

TL;DR​: I ack the problem of environment variable dups, but patching
perl etc. to ignore them is no solution to me. Either patch perl etc.
to remove the dups on early startup so they cant do harm, or "enhance"
ld.so to do so (if POSIX and ABI permits that).

Sebastian

--

~ perl self.pl
~ $_='print"\$_=\47$_\47;eval"';eval
~ krahmer@​suse.com - SuSE Security Team

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From tg@mirbsd.de

Sebastian Krahmer dixit​:

Worse. putenv/setenv reallocate and move stuff across the heap,
resetting __environ and may leave envp[] argument of main()
out of sync. (think threads)

POSIX says that​:
� environment is not thread-safe at all anyway
� envp out-of-sync is to be expected
� applications shall use getenv/setenv/unsetenv,
  but the historic usage of looking at environ[]
  is still allowed

glibc properly handles insecure dups on AT_SECURE. Maybe POSIX allows
to erase dups for any startup by ld.so, which would be prefered.

POSIX doesn�t say a single word about duplicates (I looked).

I also think fixing this in the startup (kernel, ld.so, crt0)
is the only option not breaking any existing applicatiions.

As for the first-vs.-last debate​: I asked a random selection
of Linux and Mac OSX users with some programming knowledge,
and they unambigously said they expect the *last* occurrence
of the key to be used, n̲o̲t̲ the first.

Tony Cook via RT dixit​:

The first of the attached patches does that by reversing the order
we initialize %ENV.

Please do *not* do that!

bye,
//mirabilos
--
This space for rent.

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From tg@mirbsd.de

Sebastian Krahmer dixit​:

On Wed, Jan 06, 2016 at 01​:08​:19PM +0000, Stephane CHAZELAS wrote​:

But can any of the env entries that a program *received* (as
opposed to the ones added later with setenv/putenv) be reordered
with glibc (or other libcs)? I've not managed to.

That�s possible, and I had that as next step after�

Can setenv("x", "value") update anything else than the *first*
entry? Or that upon the next execve(), that "x=value" may not be
the first one?

� this, which was very easy to do in MirBSD libc, but it
turns out that doing this breaks at least Perl.

Thats why I voted for removing (erasing) dups rather than ignoring them

Exactly.

(and also because of the de-sync envp[] thing).

POSIX says envp[] is not available because it would be de-synched;
setenv(3) is allowed to change environ.

Yes, see above about _removing_ dups. Also note that this example
is insecure either way since best practices is to either clearenv() and only

clearenv() is explicitly not in POSIX, but you can use execve()
or change environ in the application (which is also permissible).

then using setenv() or to pass the array itself to execve(). For exactly
the reason of junk/dups that may be inherited.

Indeed, which is why the de-duplication must happen at application
start before any user code is run, possibly in the kernel, otherwise
in the startup code.

ld.so to do so (if POSIX and ABI permits that).

POSIX doesn�t say anything about duplicates, and I�d say go, but
don�t forget static executables (oh, and those compiled with other
runtimes� FreePascal comes to mind, or anything Assembly).

bye,
//mirabilos
--
This space for rent.

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From @stephane-chazelas

2016-01-06 13​:40​:11 +0000, Thorsten Glaser​:
[...]

As for the first-vs.-last debate​: I asked a random selection
of Linux and Mac OSX users with some programming knowledge,
and they unambigously said they expect the *last* occurrence
of the key to be used, n̲o̲t̲ the first.
[...]

Not sure what you asked them exactly, but I'd expect if you ask
them which one should getenv() or setenv() update, they would
say the first and that's what most things do (getenv/setenv,
python, lua, ruby, tcl, zsh, AT&T ksh at least).

That's a mute point as that's something that is not meant to
happen anyway.

And which one it is doesn't matter anyway as long as everyone
agrees on it. But as it looks like we may not be able to have a
guarantee of order, I agree it's probably safer to enforce
unique entries at a lower level (kernel, crt...) on startup.

Would that break things though? Until now, at least on
GNU/Linux, argv and envp are interchangeable. It's still
possible some things may use the environment in unconventional
fashion? (unlikely when many shells do remove the unconventional
entries).

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From tg@mirbsd.de

stephane.chazelas@​gmail.com via RT dixit​:

2016-01-06 13​:40​:11 +0000, Thorsten Glaser​:

As for the first-vs.-last debate​: I asked a random selection
of Linux and Mac OSX users with some programming knowledge,
and they unambigously said they expect the *last* occurrence
of the key to be used, n̲o̲t̲ the first.

Not sure what you asked them exactly, but I'd expect if you ask

I asked them that, when I pass x=1 x=2 x=3 to a program in the
environment, what they�d expect the value of x to be.

them which one should getenv() or setenv() update, they would
say the first and that's what most things do (getenv/setenv,
python, lua, ruby, tcl, zsh, AT&T ksh at least).

This is probably either due to actually using getenv/setenv
or not having thought about it enough. Consider​:

  env = {}
  env['x'] = '1'
  env['x'] = '2'
  env['x'] = '3'

I fail to see why there ought to be a distinction between
environment imports and in-language assignments in the first
place; additionally, for things like JSON, the last key wins
as well, so this is overall more consistent.

And which one it is doesn't matter anyway as long as everyone
agrees on it.

Getting everyone to agree probably will be a hell of fun�
(including my own stubbornness) �

guarantee of order, I agree it's probably safer to enforce
unique entries at a lower level (kernel, crt...) on startup.

Right.

Would that break things though? Until now, at least on
GNU/Linux, argv and envp are interchangeable. It's still

You mean environ and envp.

They are interchangable until the first setenv() with a
new value was called, but no longer, from that point.

But look at how main() gets called. I�m using the MirBSD
code as reference here, as I�m intimately familiar with
it, but I assume most ELF-based systems do the same. Let�s
look at a statically linked executable.

The kernel loads the ELF binary and jumps to its entry
point, which is usually some variant of the _start symbol,
as gcc calls ld with -e _start (or -e __start). That is
called �raw�, i.e. not with a function frame.

The implementation of __start then reads some arguments
from registers (such as ps_strings) and stack (argc,
for example) and calculates envp as the stack address
just behind argc plus argc+1 times the size of a pointer.
Then it constructs a C stack frame and calls some less
machine-specific function which does a couple of things,
depending on the OS. For MirBSD, that is​:

� set environ to envp
� set __progname as basename(argv[0])
� store ps_strings aside
â�¢ if profiling​: atexit(_mcleanup) and call monstartup
� call atexit(_fini); and _init();
  (_init() does things like constructors)
and its last line is​:
� exit(main(argc, argv, environ));

That means that the envp main() sees is not the envp
passed by the OS (actually, the OS doesn't pass one,
it just places it on the stack right after the argv
array) but whatever environ is set to.

Furthermore, we only d̲e̲duplicate here, i.e. we don�t
increase the size of the environment. environ[0],
environ[1], etc. are all places on the stack, pointing
to strings also on the stack. To deduplicate, we merely
have to memmove() things inside the environ array around,
which means that we don�t even need to change the value
of environ/envp but just its contents.

I remember looking at how Linux does things, and it also
just places the environment pointers on the stack after
the argument vector pointers IIRC.

Of course, doing this in the kernel would fix this the
best way as no code duplication (ld.so for dynamically
linked executables, crt0 for statically linked ones,
and some unknown amount of custom runtimes, different
C libraries (dietlibc, klibc, µClibc-ng, musl, �) and
who knows what else) is needed.

bye,
//mirabilos
--
This space for rent.

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From @stephane-chazelas

2016-01-06 18​:30​:07 +0000, Thorsten Glaser​:
[...]

Would that break things though? Until now, at least on
GNU/Linux, argv and envp are interchangeable. It's still

You mean environ and envp.
[...]

FWIW, I did mean argv[] and envp[], as in they are just two
lists of arbitrary strings. Some applications, may use envp[]
for other purposes than passing a list of environment variables.

[...]

That means that the envp main() sees is not the envp
passed by the OS (actually, the OS doesn't pass one,
it just places it on the stack right after the argv
array) but whatever environ is set to.

But environ would be the address of the first pointer to the
first env string, that's something put there by execve().

Furthermore, we only d̲e̲duplicate here, i.e. we don�t
increase the size of the environment. environ[0],
environ[1], etc. are all places on the stack, pointing
to strings also on the stack. To deduplicate, we merely
have to memmove() things inside the environ array around,
which means that we don�t even need to change the value
of environ/envp but just its contents.

Just the content of the pointers, the duplicate env values would
still stay on the stack, but no longer reached.

I remember looking at how Linux does things, and it also
just places the environment pointers on the stack after
the argument vector pointers IIRC.

Yes, see
http​://unix.stackexchange.com/questions/91561/ps-full-command-is-too-long/91562#91562

the bottom of the stack is​:

argc argv-pointers envp-pointers argv-strings envp-strings.

Of course, doing this in the kernel would fix this the
best way as no code duplication (ld.so for dynamically
linked executables, crt0 for statically linked ones,
and some unknown amount of custom runtimes, different
C libraries (dietlibc, klibc, µClibc-ng, musl, �) and
who knows what else) is needed.
[...]

Yes. True.

It can potentially break things though.

On the other hand, changing bash,dash,yash,mksh,fish,perl so
they get the first duplicate would be harmless. That may not
completely fix the problem, but nobody has offered a
counter-example yet where for example​:

putenv("PATH=sane-value");
system("foo");

with a fixed "sh" (one behaving like zsh or ksh93) can be
fooled on current systems.

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From tg@mirbsd.de

(Why isn�t dgk (the Korn in ksh) in the Cc list, anyway?)

stephane.chazelas@​gmail.com via RT dixit​:

lists of arbitrary strings. Some applications, may use envp[]
for other purposes than passing a list of environment variables.

Well POSIX doesn�t guarantee it anyway.

But environ would be the address of the first pointer to the
first env string, that's something put there by execve().

And that is not going to change when we deduplicate.

Just the content of the pointers, the duplicate env values would
still stay on the stack, but no longer reached.

Exactly.

Even better if the kernel deduplicates, of course. That
needs the various kernels� hackers to do it�

the bottom of the stack is​:

argc argv-pointers envp-pointers argv-strings envp-strings.

Right, and we just make the envp-pointers a bit shorter
(and fill it up to the original size with NULL pointers).

It can potentially break things though.

Every change can potentially break things. This is the
place where I believe the least amount of them breaks ;)

On the other hand, changing bash,dash,yash,mksh,fish,perl so
they get the first duplicate would be harmless. That may not

I don�t currently see enough reasons to change mksh to give
the first value precedence. Later assignments a̲l̲w̲a̲y̲s̲ override
earlier assignments.

Furthermore, this would make 'sh' and '(env;cat)|sh' out of sync
which can cause even more �fun��

bye,
//mirabilos
--
This space for rent.

@p5pRT
Copy link
Author

p5pRT commented Jan 6, 2016

From @hvds

"Tony Cook via RT" <perl5-security-report@​perl.org> wrote​:
:On Mon Jan 04 15​:22​:04 2016, stephane.chazelas@​gmail.com wrote​:
:> - fix perl and yash so they get the first entry for a variable
:
:The first of the attached patches does that by reversing the order
:we initialize %ENV.

Technically, this looks good to me, independent of the discussion on
whether it's desirable.

:> - one could argue that setenv, putenv, or assigning env vars in
:> other languages should also remove duplicates.
:
:The second patch attempts to remove duplicates of an environment variable
:when it's set.

Other than minor comments below, this also looks good. I'm a bit worried
that Perl_vmssetenv() explicitly documents that it will set/remove only
the first copy found, it'd be useful to get some insight into the nature
of this under VMS and what effect this change for other platforms will
have on portability.

[...]
:+ nlen = strlen(nam);

For later​: note that this (and the same in alternate branches below) is the
same as len above, other than type; surely we can avoid calling strlen()
twice for these?

:+ if (environ[i]) { /* environ[i] is a match */

I think this could do with a clearer comment, maybe something like​:
  /* environ[i] now points to an entry to be removed; remove it, and any
  * further instances of the same name.
  */

: # if defined(__CYGWIN__)|| defined(__SYMBIAN32__) || defined(__riscos__) || (defined(__sun) && defined(HAS_UNSETENV))

Note this context line will conflict on rebase, since blead now has
PERL_DARWIN too.

:+ /* ensure any duplicates are removed */
:+ while (PerlEnv_getenv(nam))
:+ unsetenv(nam);

I'm nervous of infinite loops here and in the unsetenv+putenv case below.
I'd really like to see a test that would exercise these in the obscure
platform smokes, even if we have to release before we can see those.

: # else /* ! HAS_UNSETENV */

I think this and the putenv() case below need to comment that they won't
(and can't) kill dupes. Longer term, there might be value in a build
option that would refuse to build in these cases.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jan 7, 2016

From @tonycoz

On Wed Jan 06 06​:13​:54 2016, mirabilos wrote​:

Tony Cook via RT dixit​:

The first of the attached patches does that by reversing the order
we initialize %ENV.

Please do *not* do that!

One problem with the current behaviour is that the internals of perl (and XS code) that use getenv() and perl code can see two different values for an environment variable, eg​:

$ execve ./perl -Ilib -MPOSIX -le 'print $ENV{LANG}; print setlocale("");' EOA LANG=en_US.UTF-8 LANG=en_AU.UTF-8
en_AU.UTF-8
en_US.UTF-8

Reversing the order fixes that when getenv() works in what seems to me to be the most obvious way (a linear search from the start of environ.)

Another change I'm considering is aborting under -T if a duplicate is found.

I can see that causing problems if anyone is creating an envp[] for execve() by copying environ[] and appending new definitions to the end without removing duplicates.

Tony

@p5pRT
Copy link
Author

p5pRT commented Jan 7, 2016

From tg@mirbsd.de

Tony Cook via RT dixit​:

On Wed Jan 06 06​:13​:54 2016, mirabilos wrote​:

Tony Cook via RT dixit​:

The first of the attached patches does that by reversing the order
we initialize %ENV.

Please do *not* do that!

One problem with the current behaviour is that the internals of perl
(and XS code) that use getenv() and perl code can see two different
values for an environment variable, eg​:

Unless that has changed since Perl 5.8.8, if you enable putenv,
this will additionally break things (try with three or more values�
in my test I didn�t reverse but assigning to %ENV during import
would overwrite values when Perl order and setenv order don�t match).

I can see that causing problems if anyone is creating an envp[] for
execve() by copying environ[] and appending new definitions to the end
without removing duplicates.

Another point in favour of last-one-wins, and another point in
favour of deduplicating between handover to the kernel on execve()
and running any user code in the new process.

bye,
//mirabilos
--
This space for rent.

@p5pRT
Copy link
Author

p5pRT commented Jan 7, 2016

From @stephane-chazelas

2016-01-06 20​:02​:44 -0800, Tony Cook via RT​:
[...]

One problem with the current behaviour is that the internals
of perl (and XS code) that use getenv() and perl code can see
two different values for an environment variable, eg​:

$ execve ./perl -Ilib -MPOSIX -le 'print $ENV{LANG}; print setlocale("");' EOA LANG=en_US.UTF-8 LANG=en_AU.UTF-8
en_AU.UTF-8
en_US.UTF-8

Yes, I had already mentioned that in my initial report but
forgot to add it when /widening/ the discussion, sorry about
that. Original report attached in case there's more I forgot to
mention (my memory is not as good as it used to be...).

Reversing the order fixes that when getenv() works in what
seems to me to be the most obvious way (a linear search from
the start of environ.)

Yes, and one of the big questions is indeed "can we rely (in
practice, in existing implementations, I know POSIX gives no
guarantee there) on getenv() returning the first entry and on
setenv()/putenv() not re-shuffling the entries (at least the
order of entries received on the last execve() preserved for the
next execve())?"

Another change I'm considering is aborting under -T if a duplicate is found.

Sounds reasonable. As I noted earlier though, pre-shellshock
bashs could genuinely have two instances of a variable (but then
again, those duplicates would be stripped by some tools like
most shells, so it would already be broken)

I can see that causing problems if anyone is creating an
envp[] for execve() by copying environ[] and appending new
definitions to the end without removing duplicates.
[...]

If they do, they're broken already, as that duplicate entry
won't be accessible by most things including getenv().

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 7, 2016

From @stephane-chazelas

Message RFC822:
X-RT-Original-Encoding: utf-8
From: Stephane Chazelas stephane.chazelas@gmail.com
User-Agent: Mutt/1.5.21 (2010-09-15)
MIME-Version: 1.0
Date: Fri, 20 Nov 2015 15:21:46 +0000
To: "Todd C. Miller" Todd.Miller@courtesan.com, team@security.debian.org
Subject: sudo/setuids and multiple environment entries with the same name
Message-ID: 20151120152146.GA10325@chaz.gmail.com
Content-Disposition: inline
Content-Type: text/plain; charset="utf-8"
Content-Length: 3845

Hello,

this is not an issue with sudo, but I think something sudo
(and the libc/dynamic linker at least in setuid/setgid...)
should guard against.

Consider this little C program:

#define _GNU_SOURCE
#include <unistd.h>
int main(int argc, char* argv[])
{
char *e[] = {"LC_ALL=tr_TR.UTF-8", "LC_ALL=tr_TR.UTF-8", 0};
execvpe(argv[1], &argv[1], e);
return 1;
}

to be used as:

./a.out cmd...

(on non-GNU systems, replace execvpe with execvp and cmd with
/path/to/cmd).

That calls cmd with LC_ALL=tr_TR.UTF-8 twice in the environment.

If I run it as:

./a.out sudo cmd

That variable is still passed twice.

Now, if we have a variable passed twice like that, the behaviour
varies between applications. getenv("LC_ALL") will return the
first one, $ENV{"LC_ALL"} in perl will return the last one, but
$ENV{"LC_ALL"} = "C" will update the first one (!).

Some shells $LC_ALL (zsh, AT&T ksh) will get the first one,
dash, bash, yash, pdksh the last one. Of those, all but zsh and
yash will remove the duplicates. perl/ruby/python won't.

setenv() and putenv() on GNU, like zsh, yash, perl, ruby, python
when assigning to LC_ALL will update the first one and leave the
other ones untouched.

What that means is that if I have a program that does:

putenv("LC_ALL=C"); /* make sure we're in a sane locale /
/
or setenv("LC_ALL", "C", 1) */
system("something");

Or:

#!/usr/bin/perl
$ENV{LC_ALL}="C";
system("something");

Or:

#! /usr/bin/ruby
ENV["LC_ALL"]="C";
exec("something");

Or:

#!/bin/zsh
LC_ALL_C something

If the "sh" called by system() is based on dash/bash/pdksh which
is usually the case on FOSS OSes, or "something" is a sh/perl
script or calls a sh/perl script, then LC_ALL will still be
tr_TR.UTF-8.

I like the tr_TR.UTF-8 locale in examples, because at least on
Debian, that's one locale where uppercase i is not I (so echo
evil | grep -i EVIL fails), the decimal separator is ",",
thousand separator ".", month/day/AM/PM have multi-byte
characters... and of course all the security issues related to
UTF-8 and non-C collation.

LC_ALL is one of the variables passed along by sudo at least on
Debian, but we could also have problems with any variable (like
PATH) by any other setuid/setgid program.

Basically, the vulnerability we have here is a way to bypass
environment sanitising in privilege escalation contexts by
passing the env var several times.

I've not made a review of what software may be impacted, but
there's bound to be some. PAM modules, setuid/setgids especially
should be checked against that.

IMO, the libc and sudo should strip those duplicates at least
for setuid/setgids. bash/dash/mksh already strip duplicates but
should retain the first one to be consistent with getenv(). perl
should also be fixed to get the first one in its $ENV (that's
definitly a bug there as it updates the first one when you
assign to $ENV elements, and when it calls setlocale(), it's the
first one that is used).

One could also argue that putenv/setenv or assigning env vars
in any of those languages should remove duplicates.

PoC:

$ cat a.c
#define _GNU_SOURCE
#include <unistd.h>
int main(int argc, char* argv[])
{
char e[] = {"LC_ALL=tr_TR.UTF-8", "LC_ALL=tr_TR.UTF-8", 0};
execvpe(argv[1], &argv[1], e);
return 1;
}
$ cat sanitize.c
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char
argv[]) {putenv("LC_ALL=C"); execvp(argv[1], &argv[1]);}
$ make a
cc a.c -o a
$ make sanitize
cc sanitize.c -o sanitize
$ ./a sudo ./sanitize sh -c 'echo evil | grep -i EVIL'
$ ./a sudo ./sanitize sh -c 'date'
Cum Kas 20 14:21:49 GMT 2015
$ ./a sudo perl -e '$ENV{LC_ALL}="C"; system("date")'
Fri Nov 20 14:22:25 GMT 2015
$ ./a sudo perl -e '$ENV{LC_ALL}="C"; system("date>&2")'
Cum Kas 20 14:22:31 GMT 2015
$ ./a sudo zsh -c 'LC_ALL=C sh -c date'
Cum Kas 20 14:23:32 GMT 2015

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 7, 2016

From @stephane-chazelas

Hello,

this is not an issue with sudo, but I think something sudo
(and the libc/dynamic linker at least in setuid/setgid...)
should guard against.

Consider this little C program​:

#define _GNU_SOURCE
#include <unistd.h>
int main(int argc, char* argv[])
{
  char *e[] = {"LC_ALL=tr_TR.UTF-8", "LC_ALL=tr_TR.UTF-8", 0};
  execvpe(argv[1], &argv[1], e);
  return 1;
}

to be used as​:

./a.out cmd...

(on non-GNU systems, replace execvpe with execvp and cmd with
/path/to/cmd).

That calls cmd with LC_ALL=tr_TR.UTF-8 twice in the environment.

If I run it as​:

./a.out sudo cmd

That variable is still passed twice.

Now, if we have a variable passed twice like that, the behaviour
varies between applications. getenv("LC_ALL") will return the
first one, $ENV{"LC_ALL"} in perl will return the last one, but
$ENV{"LC_ALL"} = "C" will update the first one (!).

Some shells $LC_ALL (zsh, AT&T ksh) will get the first one,
dash, bash, yash, pdksh the last one. Of those, all but zsh and
yash will remove the duplicates. perl/ruby/python won't.

setenv() and putenv() on GNU, like zsh, yash, perl, ruby, python
when assigning to LC_ALL will update the first one and leave the
other ones untouched.

What that means is that if I have a program that does​:

putenv("LC_ALL=C"); /* make sure we're in a sane locale */
/* or setenv("LC_ALL", "C", 1) */
system("something");

Or​:

#!/usr/bin/perl
$ENV{LC_ALL}="C";
system("something");

Or​:

#! /usr/bin/ruby
ENV["LC_ALL"]="C";
exec("something");

Or​:

#!/bin/zsh
LC_ALL_C something

If the "sh" called by system() is based on dash/bash/pdksh which
is usually the case on FOSS OSes, or "something" is a sh/perl
script or calls a sh/perl script, then LC_ALL will still be
tr_TR.UTF-8.

I like the tr_TR.UTF-8 locale in examples, because at least on
Debian, that's one locale where uppercase i is not I (so echo
evil | grep -i EVIL fails), the decimal separator is ",",
thousand separator ".", month/day/AM/PM have multi-byte
characters... and of course all the security issues related to
UTF-8 and non-C collation.

LC_ALL is one of the variables passed along by sudo at least on
Debian, but we could also have problems with any variable (like
PATH) by any other setuid/setgid program.

Basically, the vulnerability we have here is a way to bypass
environment sanitising in privilege escalation contexts by
passing the env var several times.

I've not made a review of what software may be impacted, but
there's bound to be some. PAM modules, setuid/setgids especially
should be checked against that.

IMO, the libc and sudo should strip those duplicates at least
for setuid/setgids. bash/dash/mksh already strip duplicates but
should retain the first one to be consistent with getenv(). perl
should also be fixed to get the first one in its $ENV (that's
definitly a bug there as it updates the first one when you
assign to $ENV elements, and when it calls setlocale(), it's the
first one that is used).

One could also argue that putenv/setenv or assigning env vars
in any of those languages should remove duplicates.

PoC​:

$ cat a.c
#define _GNU_SOURCE
#include <unistd.h>
int main(int argc, char* argv[])
{
  char *e[] = {"LC_ALL=tr_TR.UTF-8", "LC_ALL=tr_TR.UTF-8", 0};
  execvpe(argv[1], &argv[1], e);
  return 1;
}
$ cat sanitize.c
#include <stdlib.h>
#include <unistd.h>
int main(int argc, char*argv[]) {putenv("LC_ALL=C"); execvp(argv[1], &argv[1]);}
$ make a
cc a.c -o a
$ make sanitize
cc sanitize.c -o sanitize
$ ./a sudo ./sanitize sh -c 'echo evil | grep -i EVIL'
$ ./a sudo ./sanitize sh -c 'date'
Cum Kas 20 14​:21​:49 GMT 2015
$ ./a sudo perl -e '$ENV{LC_ALL}="C"; system("date")'
Fri Nov 20 14​:22​:25 GMT 2015
$ ./a sudo perl -e '$ENV{LC_ALL}="C"; system("date>&2")'
Cum Kas 20 14​:22​:31 GMT 2015
$ ./a sudo zsh -c 'LC_ALL=C sh -c date'
Cum Kas 20 14​:23​:32 GMT 2015

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 11, 2016

From @rjbs

It seems to me like the sanest, simplest, and least confusing thing is to use Tony Cook's two patches to (a) make perl's env have the same contents you'd get from a linear search and (b) clean up dupes on set. Tony and I briefly discussed the possibility of cleaning up the environment itself on boot, but it didn't seem like a good way forward.

As for barfing on ambiguous environments​:

* I'd love doing that always if I felt more sure that it was nearly never the case in sane operation (but I really don't feel that way)
* I'm not keen on making it depend on -T, as it turns -T into "more security" rather than "taint"
* ...but that doesn't mean I think it's a bad idea, either

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Jan 11, 2016

From @jhi

On Monday-201601-11 18​:37, Ricardo SIGNES via RT wrote​:

* I'd love doing that always if I felt more sure that it was nearly never the case in sane operation (but I really don't feel that way)
* I'm not keen on making it depend on -T, as it turns -T into "more security" rather than "taint"
* ...but that doesn't mean I think it's a bad idea, either

Croak by default but have an environment variable

env PERL_ALLOW_DUP_ENV_FULL_SPEED_AHEAD_AND_DAMN_THE_TORPEDOES=1 ...

would be playing on the "safe by default" side, while still allowing a
foolhardy captain to steam ahead?

@p5pRT
Copy link
Author

p5pRT commented Jan 11, 2016

From @tonycoz

On Mon, Jan 11, 2016 at 06​:42​:36PM -0500, Jarkko Hietaniemi wrote​:

On Monday-201601-11 18​:37, Ricardo SIGNES via RT wrote​:

* I'd love doing that always if I felt more sure that it was nearly never the case in sane operation (but I really don't feel that way)
* I'm not keen on making it depend on -T, as it turns -T into "more security" rather than "taint"
* ...but that doesn't mean I think it's a bad idea, either

Croak by default but have an environment variable

env PERL_ALLOW_DUP_ENV_FULL_SPEED_AHEAD_AND_DAMN_THE_TORPEDOES=1 ...

would be playing on the "safe by default" side, while still allowing a
foolhardy captain to steam ahead?

Do you know if VMS can have multiple definitions of the same name in
the environment in normal cases?

Tony

@p5pRT
Copy link
Author

p5pRT commented Jan 11, 2016

From @jhi

I don't know VMS that intimately but I would suspect it does not,
since I think env there is basically a system-level lookup table,
instead of a manually-diddled single block of memory. In fact, it is
a hierarchy of lookup tables.

If we don't have Craig in perl-security, we should.

On Mon, Jan 11, 2016 at 6​:49 PM, Tony Cook <tony@​develop-help.com> wrote​:

On Mon, Jan 11, 2016 at 06​:42​:36PM -0500, Jarkko Hietaniemi wrote​:

On Monday-201601-11 18​:37, Ricardo SIGNES via RT wrote​:

* I'd love doing that always if I felt more sure that it was nearly never the case in sane operation (but I really don't feel that way)
* I'm not keen on making it depend on -T, as it turns -T into "more security" rather than "taint"
* ...but that doesn't mean I think it's a bad idea, either

Croak by default but have an environment variable

env PERL_ALLOW_DUP_ENV_FULL_SPEED_AHEAD_AND_DAMN_THE_TORPEDOES=1 ...

would be playing on the "safe by default" side, while still allowing a
foolhardy captain to steam ahead?

Do you know if VMS can have multiple definitions of the same name in
the environment in normal cases?

Tony

--
There is this special biologist word we use for 'stable'. It is
'dead'. -- Jack Cohen

@p5pRT
Copy link
Author

p5pRT commented Jan 11, 2016

From @rjbs

On Mon Jan 11 15​:43​:05 2016, jhi wrote​:

On Monday-201601-11 18​:37, Ricardo SIGNES via RT wrote​:

* I'd love doing that always if I felt more sure that it was nearly
never the case in sane operation (but I really don't feel that way)
* I'm not keen on making it depend on -T, as it turns -T into "more
security" rather than "taint"
* ...but that doesn't mean I think it's a bad idea, either

Croak by default but have an environment variable

env PERL_ALLOW_DUP_ENV_FULL_SPEED_AHEAD_AND_DAMN_THE_TORPEDOES=1 ...

would be playing on the "safe by default" side, while still allowing a
foolhardy captain to steam ahead?

So, my concern with that is something like this​:

If ambiguous environments are created by (possibly half-baked programs written in) shells, and those are the only places where this is going to approach "common" in the wild, then we're going to see zero reports of failure until things are failing on production systems.

Lots of people smoke their applications and libraries on bleadperl, but nobody "smokes" their production systems, cron jobs, glue, and so on. They also tend to run that stuff against system perl, meaning that they won't see the breakage when v5.x.0 comes out, but some years later when their scripts are ported to the new company standard, Excited Egret. Changes that will affect "the stuff sysadmins do" are the changes where I feel we must be the most careful.

So, with that in mind, are we better off having perl bail or having it change to use the first value? It's not entirely clear. Either way (putting aside the case where (X=1) is found multiple times in env), the behavior changes. My gut says that we're better off having programs still run, although I'm not sure I have any strong argument to make in favor of it.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Jan 12, 2016

From @rjbs

On Mon Jan 11 15​:57​:32 2016, jhi wrote​:

If we don't have Craig in perl-security, we should.

Tony noted his absence and I sent him an invite earlier this evening. If he declines to join I'll at least get him to consult on this ticket.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Jan 18, 2016

From @craigberry

I've reflected a bit more on this and concluded that as far as handling duplicates in the C run-time's environ array, nothing special needs to be done for VMS.

We don't currently run perl.c​:S_init_postdump_symbols, but I think we should and will enable it shortly. At which point Tony's first patch to take the first entry passed in when there are duplicates will do the same for us as anybody else.

VMS doesn't and mustn't use util.c​:Perl_my_setenv because of all the special magic in vms.c​:Perl_vmssetenv, so Tony's second patch won't have any effect on VMS. As of cda27dc, vms.c​:Perl_vmssetenv now uses the CRTL's unsetenv when removing something from the environ array. unsetenv has been available for twenty years and is documented to remove all versions when there's more than one. I've tested it and it works as advertised, so we're ok there.

Now to the fact that %ENV is concocted from multiple sources on VMS and one of those sources (logical names) has multiple layers, and that deleting a key in %ENV could thus leave another value for the same key visible. There are multiple problems with trying to change that to delete all versions​:

1.) It breaks documented behavior.
2.) It contradicts native expectations about how the environment works.
3.) It can never work reliably because successive versions are increasingly likely to be defined in places a user program does not have privileges to modify.
4.) It seems to me a contentious change (even if I have only myself to contend with) and I don't think I'm going to have it all figured (or even understand what exploits if any, are possible) before the contentious change freeze two days from now.

So I think I need to leave things as is for now, but at least there is nothing VMS-related that stands in the way of proceeding with other changes discussed here.

@p5pRT
Copy link
Author

p5pRT commented Jan 21, 2016

From @stephane-chazelas

2016-01-21 12​:08​:38 -0500, Red Hat Product Security​:
[...]

However, this appears to be one of the issues where it's very important to have
a general consensus on how exactly the fix should work and look like. I do
believe that finding this consensus will be a huge part of the work that needs
to be done to resolve this issue.

I don't think that what we have now, a collection of CCed parties, is a
suitable place for discussing this. Thus, I think we should ultimately move
this discussion to a more suitable place, such as a public mailing list.
libc-alpha was suggested before.
[...]

perl is possibly the most severely impacted (bypass of taint
mode) and runs on a variety of Unix-likes and non-Unix-likes.

From the discussion here, they've discussed changing their %ENV
so it picks the first entry like getenv(3) on "get" and remove
duplicates on "set".

Once that's done that would probably fix it on most systems,
then we can discuss the libc hardening.

Personaly, I'm happy for the discussion to go to libc-alpha as
long as the maintainers of perl and bash (ported to and used on
most systems, and primary "vectors") are happy about it.

If anybody has any objection to the issue being made public,
they should raise it now.

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 21, 2016

From @fweimer

* Red Hat Product Security​:

In order to keep things moving​: Red Hat has assigned CVE-2016-0748
to a fix planned to be in glibc. I expect the libc fix to be some
kind of a "milestone", which would allow us to go ahead and poke at
all the other stuff that may need separate fixing.

However, this appears to be one of the issues where it's very
important to have a general consensus on how exactly the fix should
work and look like. I do believe that finding this consensus will be
a huge part of the work that needs to be done to resolve this issue.

I don't think that what we have now, a collection of CCed parties,
is a suitable place for discussing this. Thus, I think we should
ultimately move this discussion to a more suitable place, such as a
public mailing list. libc-alpha was suggested before.

Thoughts?

I agree with what Stefan wrote above. It's a local issue only, and
the complex interactions and backwards compatibility concerns call for
public review, IMHO.

And as I wrote, this has been documented as a source of security
issues as early as 1996. I agree it's time to fix this in a central
place, but where that should happen is not quite clear to me. I'm
leaning towards glibc, but I'm biased.

Thanks,
Florian

@p5pRT
Copy link
Author

p5pRT commented Jan 24, 2016

From @jhi

On Thu Jan 21 10​:22​:44 2016, fw@​deneb.enyo.de wrote​:

* Red Hat Product Security​:

In order to keep things moving​: Red Hat has assigned CVE-2016-0748
to a fix planned to be in glibc. I expect the libc fix to be some
kind of a "milestone", which would allow us to go ahead and poke at
all the other stuff that may need separate fixing.

However, this appears to be one of the issues where it's very
important to have a general consensus on how exactly the fix should
work and look like. I do believe that finding this consensus will be
a huge part of the work that needs to be done to resolve this issue.

I don't think that what we have now, a collection of CCed parties,
is a suitable place for discussing this. Thus, I think we should
ultimately move this discussion to a more suitable place, such as a
public mailing list. libc-alpha was suggested before.

Thoughts?

I agree with what Stefan wrote above. It's a local issue only, and
the complex interactions and backwards compatibility concerns call for
public review, IMHO.

And as I wrote, this has been documented as a source of security
issues as early as 1996. I agree it's time to fix this in a central
place, but where that should happen is not quite clear to me. I'm
leaning towards glibc, but I'm biased.

glibc (and correspondingly, the libc-alpha mailing list) specific discussion might not be directly relevant (*) to the BSD folks (I see at least Todd Miller on the CC list, and @​mirbsd) Anyone here from FreeBSD? NetBSD? Or Apple?

(*) Though, of course, while any particular patches for glibc will not directly applicable to non-glibc lands, in spirit and intent they might be useful.

Maybe there needs to be three different "forks" of this discussion​: the glibc, the *BSD camp, and then the shells/perl. (Non-UNIX environments, which at least Perl covers, are then yet another dimension, each their own.)

But until there's some sort of general overall plan and rough consensus (anyone care to attempt a summary?), such forking of discussion is probably premature.

Thanks,
Florian

@p5pRT
Copy link
Author

p5pRT commented Jan 26, 2016

From @rjbs

* Stephane CHAZELAS <stephane.chazelas@​gmail.com> [2016-01-21T12​:39​:08]

perl is possibly the most severely impacted (bypass of taint
mode) and runs on a variety of Unix-likes and non-Unix-likes.

From the discussion here, they've discussed changing their %ENV
so it picks the first entry like getenv(3) on "get" and remove
duplicates on "set".

Yes, and I think we need to apply that. Also, because this affects the taint
system, a Perl-specific feature meant to protect you from an untrusted
environment, I think this goes beyond the well-known (I am told :)) problem of
an ambiguous environment. I will be assigning the Perl-specific part of this
problem a CVE number and doing pre-disclosure of the issue to our downstream
packagers in the next few days, probably targeting Feb 9 for release.

Tony Cook is producing an updated patch, which I'll be distributing downstream.

If anybody has any objection to the issue being made public,
they should raise it now.

Although I wouldn't object to the immediate discussion of the problem in
general, it is almost certainly going to lead to scrutiny of things that
misbehave in the context of an ambiguous environment, meaning perl. I would
prefer if the public discussion could wait until then.

Thanks for your patience in waiting for a reply. I have been neglecting my
inbox whilst my hometown was smothered in an absurd amount of snow.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Jan 26, 2016

From @craigberry

On Mon, Jan 18, 2016 at 12​:47 PM, Craig A. Berry via RT
<bugs-comment@​bugs6.perl.org> wrote​:

as far as handling duplicates in the C run-time's environ array, nothing special needs to be
done for VMS.

I've done further testing and now think the parts of
S_init_postdump_symbols that we skip on VMS need to stay skipped for
now. Which means Tony's first patch needs to be augmented by the
attached patch to the equivalent bits in vms/vms.c. I will try to
remember to apply it after the public disclosure, but if it could be
simply added to the proposed patch circulated with the CVE, that would
be great.

Inline Patch
--- vms/vms.c;-0 2016-01-16 18:36:16 -0600
+++ vms/vms.c 2016-01-21 15:51:27 -0600
@@ -1307,7 +1307,9 @@ prime_env_iter(void)
     if (!str$case_blind_compare(env_tables[i],&crtlenv)) {
       char *start;
       int j;
-      for (j = 0; environ[j]; j++) {
+      /* Start at the end, so if there is a duplicate we keep the first one. */
+      for (j = 0; environ[j]; j++);
+      for (j--; j >= 0; j--) {
         if (!(start = strchr(environ[j],'='))) {
           if (ckWARN(WARN_INTERNAL))
             Perl_warner(aTHX_ packWARN(WARN_INTERNAL),"Ill-formed
CRTL environ value \"%s\"\n",environ[j]);

@p5pRT
Copy link
Author

p5pRT commented Jan 26, 2016

From @craigberry

no_dup_env_vms.patch
--- vms/vms.c;-0	2016-01-16 18:36:16 -0600
+++ vms/vms.c	2016-01-21 15:51:27 -0600
@@ -1307,7 +1307,9 @@ prime_env_iter(void)
     if (!str$case_blind_compare(env_tables[i],&crtlenv)) {
       char *start;
       int j;
-      for (j = 0; environ[j]; j++) { 
+      /* Start at the end, so if there is a duplicate we keep the first one. */
+      for (j = 0; environ[j]; j++);
+      for (j--; j >= 0; j--) { 
         if (!(start = strchr(environ[j],'='))) {
           if (ckWARN(WARN_INTERNAL)) 
             Perl_warner(aTHX_ packWARN(WARN_INTERNAL),"Ill-formed CRTL environ value \"%s\"\n",environ[j]);

@p5pRT
Copy link
Author

p5pRT commented Jan 26, 2016

From @stephane-chazelas

2016-01-21 12​:08​:38 -0500, Red Hat Product Security​:
[...]

In order to keep things moving​: Red Hat has assigned CVE-2016-0748 to a fix
planned to be in glibc. I expect the libc fix to be some kind of a "milestone",
which would allow us to go ahead and poke at all the other stuff that may need
separate fixing.

However, this appears to be one of the issues where it's very important to have
a general consensus on how exactly the fix should work and look like. I do
believe that finding this consensus will be a huge part of the work that needs
to be done to resolve this issue.
[...]

Also note that doing it at the libc level means
- doing it for every libc. On Linux alone, there are at least 3
that I know, and there's one more per other system.
- recompiling static binaries (at least the setuid/setgid...
ones that may execute commands).
- could break things, so maybe should be done only for
setuid/setgids which means shells would still potentially have
a problem.

Also, if perl fixes it with a CVE, it's going to become very
obvious that bash, and BSDs "sh"s have the same issue.
BTW, I've just found out that even though ksh93 is not affected,
ksh88 seems to be (at least /usr/xpg4/bin/sh on Solaris). The
Bourne shell may be as well, at least the Heirloom toolchest
one (would affect Solaris 10 and earlier /bin/sh).

I'm under the impression that fixing bash and BSDs sh (and perl
which is already under way) is easier from that point of view.

A deduplicating at the libc/kernel level would still be a good
idea, as a hardening measure in case there still are
applications that do use envp[] as a list of variables but
query/set environment variables in a different way from
getenv/putenv/setenv.

I suspect it may be difficult to get a consensus among Unix
vendors on the scope of that and how it should be done.

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 26, 2016

From tg@mirbsd.de

stephane.chazelas@​gmail.com via RT dixit​:

Also note that doing it at the libc level means
- doing it for every libc. On Linux alone, there are at least 3
that I know, and there's one more per other system.

Linux​:
� libc5 (yes I know)
� glibc
  â�£ they said something about doing it in ld.so which would
  break for static binaries, so theyâ��d have to do it twice
  â�£ OTOH this would fix Hurd and kFreeBSD in one go
� musl
� klibc
� dietlibc
� µClibc
� µClibc-ng
� bionic
� the one I�m writing� (not currently officially published)

Hurd​: I only know of glibc

� plus at least one libc per BSD, but they also have one
kernel each� though fixing it in the FreeBSD kernel would
also fix kFreeBSD in the same go.

I�d *really* prefer this to be fixed in the various kernels
(although, as userspace is allowed to change what environ
points to, the libc may need to do that on reinitialisation�
but only if we decide that�s an attack vector; it may not be
since normalisation in the kernel happens at execve time).

Now my own stance​:

BSD OS developer PoV​: Iâ��m likely to fix it in the kernel
if I manage to do that (admittedly my kernel development
skills are subpar), and possibly add something to libc as
its *env() functions are severely buggy anyway. Although,
I�m still undecided if kernel normalisation will use the
first or the last occurrence (not looking at mksh, both
make sense) and open to arguments for either from others
(especially the OpenBSD people, as they have sorta the
same code (modulo a couple of years�) and they�re usually
technically right).

mksh tool PoV​: Iâ��m not going to change mksh; it will always
continue to use the *last* occurrence as this is a requirement
for re-entrancy of environment dump/import.

bye,
//mirabilos
--
Stéphane, I actually don�t block Googlemail, they�re just too utterly
stupid to successfully deliver to me (or anyone else using Greylisting
and not whitelisting their ranges). Same for a few other providers such
as Hotmail. Some spammers (Yahoo) I do block.

@p5pRT
Copy link
Author

p5pRT commented Jan 26, 2016

From @stephane-chazelas

2016-01-26 09​:46​:48 -0800, Thorsten Glaser via RT​:
[...]

mksh tool PoV​: Iâ��m not going to change mksh; it will always
continue to use the *last* occurrence as this is a requirement
for re-entrancy of environment dump/import.
[...]

I'm not sure I understand that argument.

mksh deduplicates the environment already.

So if you dump the environmnet from mksh, you'll see only one
entry per variable (currently the last one, not the one that was
sanitized by your caller in the security issue being discussed
here).

Importing that dump will set the same variable.

If you change that to be the first one, still deduplicated, you
fix the security issue, your dump will still have one entry,
which will still set the same variable on import.

Can you please clarify what would break if you changed mksh?

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 26, 2016

From tg@mirbsd.de

stephane.chazelas@​gmail.com via RT dixit​:

So if you dump the environmnet from mksh, you'll see only one

Yes, but you can dump it from a called program instead.

Can you please clarify what would break if you changed mksh?

Consistency � the notation of the environment as a set of
key=value assignments read in order.

bye,
//mirabilos
--
Stéphane, I actually don�t block Googlemail, they�re just too utterly
stupid to successfully deliver to me (or anyone else using Greylisting
and not whitelisting their ranges). Same for a few other providers such
as Hotmail. Some spammers (Yahoo) I do block.

@p5pRT
Copy link
Author

p5pRT commented Jan 26, 2016

From @jhi

A deduplicating at the libc/kernel level would still be a good
idea, as a hardening measure in case there still are
applications that do use envp[] as a list of variables but
query/set environment variables in a different way from
getenv/putenv/setenv.

I think we need to think of defense in depth here​: libc/kernel *and* the
perl/shells.

It's not like every system will want to / will be able to / will ever
upgrade to a newer libc. With the number of libc/kernel combinations
just across Linux and *BSD, never mind other UNIXy systems, doubly never
mind non-UNIXy systems, it is almost guaranteed that some of those
combinations will not be patched / will be patched differently / will
be patched but somewhat later.

@p5pRT
Copy link
Author

p5pRT commented Jan 27, 2016

From @tonycoz

On Mon Jan 25 21​:09​:09 2016, perl.security@​rjbs.manxome.org wrote​:

Tony Cook is producing an updated patch, which I'll be distributing
downstream.

It's not so much an updated patch, but a new patch.

This uses a different approach, rather than de-dupping a given name in environ[] on my_setenv() it scans for duplicates when we load %ENV, and if it finds any, removes them.

Unlike the previous patch, where every use of my_setenv() paid a cost for the possibility of duplicate entries, for this patch there's (mostly) only a cost when duplicate entries are found.

("mostly" because there's an extra hv_exists() call perl entry on the non-duplicates path.)

Tony

@p5pRT
Copy link
Author

p5pRT commented Jan 27, 2016

From @tonycoz

0001-perl-127158-remove-duplicate-environment-variables-f.patch
From 8a5c18363465c8ce7477dfa604afad6be02d8726 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 27 Jan 2016 11:52:15 +1100
Subject: [perl #127158] remove duplicate environment variables from environ

If we see duplicate environment variables while iterating over
environ[]:

a) make sure we use the same value in %ENV that getenv() returns.

Previously on a duplicate, %ENV would have the last entry for the name
from environ[], but a typical getenv() would return the first entry.

Rather than assuming all getenv() implementations return the first entry
explicitly call getenv() to ensure they agree.

b) remove duplicate entries from environ

Previously if there was a duplicate definition for a name in environ[]
setting that name in %ENV could result in an unsafe value being passed
to a child process, so ensure environ[] has no duplicates.
---
 perl.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 2 deletions(-)

diff --git a/perl.c b/perl.c
index c7c1fe6..f33222b 100644
--- a/perl.c
+++ b/perl.c
@@ -4320,23 +4320,70 @@ S_init_postdump_symbols(pTHX_ int argc, char **argv, char **env)
 	}
 	if (env) {
 	  char *s, *old_var;
+          STRLEN nlen;
 	  SV *sv;
+          HV *dups = newHV();
+
 	  for (; *env; env++) {
 	    old_var = *env;
 
 	    if (!(s = strchr(old_var,'=')) || s == old_var)
 		continue;
+            nlen = s - old_var;
 
 #if defined(MSDOS) && !defined(DJGPP)
 	    *s = '\0';
 	    (void)strupr(old_var);
 	    *s = '=';
 #endif
-	    sv = newSVpv(s+1, 0);
-	    (void)hv_store(hv, old_var, s - old_var, sv, 0);
+            if (hv_exists(hv, old_var, nlen)) {
+                const char *name = savepvn(old_var, nlen);
+
+                /* make sure we use the same value as getenv(), otherwise code that
+                   uses getenv() (like setlocale()) might see a different value to %ENV
+                 */
+                sv = newSVpv(PerlEnv_getenv(name), 0);
+
+                /* keep a count of the dups of this name so we can de-dup environ later */
+                if (hv_exists(dups, name, nlen))
+                    ++SvIVX(*hv_fetch(dups, name, nlen, 0));
+                else
+                    (void)hv_store(dups, name, nlen, newSViv(1), 0);
+
+                Safefree(name);
+            }
+            else {
+                sv = newSVpv(s+1, 0);
+            }
+	    (void)hv_store(hv, old_var, nlen, sv, 0);
 	    if (env_is_not_environ)
 	        mg_set(sv);
 	  }
+          if (HvKEYS(dups)) {
+              /* environ has some duplicate definitions, remove them */
+              HE *entry;
+              hv_iterinit(dups);
+              while ((entry = hv_iternext_flags(dups, 0))) {
+                  STRLEN nlen;
+                  const char *name = HePV(entry, nlen);
+                  IV count = SvIV(HeVAL(entry));
+                  IV i;
+                  SV **valp = hv_fetch(hv, name, nlen, 0);
+
+                  assert(valp);
+
+                  /* try to remove any duplicate names, depending on the
+                   * implementation used in my_setenv() the iteration might
+                   * not be necessary, but let's be safe.
+                   */
+                  for (i = 0; i < count; ++i)
+                      my_setenv(name, 0);
+
+                  /* and set it back to the value we set $ENV{name} to */
+                  my_setenv(name, SvPV_nolen(*valp));
+              }
+          }
+          SvREFCNT_dec_NN(dups);
       }
 #endif /* USE_ENVIRON_ARRAY */
 #endif /* !PERL_MICRO */
-- 
2.1.4

@p5pRT
Copy link
Author

p5pRT commented Jan 27, 2016

From @rjbs

* Jarkko Hietaniemi <jhi@​iki.fi> [2016-01-26T13​:45​:07]

A deduplicating at the libc/kernel level would still be a good
idea, as a hardening measure in case there still are
applications that do use envp[] as a list of variables but
query/set environment variables in a different way from
getenv/putenv/setenv.

I think we need to think of defense in depth here​: libc/kernel *and* the
perl/shells.

This is my position as well. Of course I would like everyone to have all the
latest patches for libc and kernel problems, but I don't think this is a
situation where we can simply say "go do that," especially because of the
enormous number of kernels and libcs we support!

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Jan 27, 2016

From @rjbs

* Stephane CHAZELAS <stephane.chazelas@​gmail.com> [2016-01-26T12​:27​:02]

Also, if perl fixes it with a CVE, it's going to become very
obvious that bash, and BSDs "sh"s have the same issue.

Yes. I suppose we can try to coordinate at least *some* of the many affected
things, but is this a fool's errand?

Chet, would you, at any rate, like to do that?

I have not yet started notification for this problem, as we just got another
taint-related ticket today and I'd like to combine them if we agree with the
other report. Tomorrow, maybe.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Jan 27, 2016

From chet.ramey@case.edu

-----BEGIN PGP SIGNED MESSAGE-----
Hash​: SHA1

On 1/26/16 10​:57 PM, Ricardo Signes wrote​:

* Stephane CHAZELAS <stephane.chazelas@​gmail.com> [2016-01-26T12​:27​:02]

Also, if perl fixes it with a CVE, it's going to become very
obvious that bash, and BSDs "sh"s have the same issue.

Yes. I suppose we can try to coordinate at least *some* of the many affe
cted
things, but is this a fool's errand?

Chet, would you, at any rate, like to do that?

I'm not enthusiastic about any of the proposed `solutions'. Stephane's
statement could just as easily apply to any of the C libraries on any
affected system, or any program that manipulates its environment
without relying on getenv/setenv/putenv. I think the right place to
apply any fix is in libc.

I'm sympathetic to Thorsten's argument that the environment is an ordered
set of name-value pairs and there is no reason to prefer the first
instance of a particular name other than the (fairly weak) argument that
"getenv/setenv do it that way." There's a consistency in treating
something like "a=1 a=2 a=3 prog" identically to `prog' receiving an
environment with [0]='a=1' [1]='a=2' [2]='a=3'.

The question is whether `fixing' shells and other programs to behave like
getenv and leaving the root cause (at least temporarily) unaddressed will
do any good.

Chet
- --
``The lyf so short, the craft so long to lerne.'' - Chaucer
  ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU chet@​case.edu http​://cnswww.cns.cwru.edu/~chet/
-----BEGIN PGP SIGNATURE-----
Version​: GnuPG v2

iEYEARECAAYFAlao8zQACgkQu1hp8GTqdKvIeACfcaZhMJrrEIHkAxSlNooz/e01
f4wAmwam6Wgnc1OoVm2/0d3rS4JfMC5q
=tzf0
-----END PGP SIGNATURE-----

@p5pRT
Copy link
Author

p5pRT commented Jan 27, 2016

From @stephane-chazelas

2016-01-27 11​:41​:34 -0500, Chet Ramey​:
[...]

I'm sympathetic to Thorsten's argument that the environment is an ordered
set of name-value pairs and there is no reason to prefer the first
instance of a particular name other than the (fairly weak) argument that
"getenv/setenv do it that way."

The argument is that if we do it differently from
getenv/setenv, then we're introducing the security
vulnerability described here.

perl's latest approach to actually call getenv to assign %ENV
elements would avoid the question of which order to scan the
environment.

There's a consistency in treating
something like "a=1 a=2 a=3 prog" identically to `prog' receiving an
environment with [0]='a=1' [1]='a=2' [2]='a=3'.

both of which are pathological cases. (And for the first case,
several historic shell implementations have performed those
assignments right to left, "a=1 a=2 a=3 printenv a" still prints
"a" with Solaris 10 /bin/sh for instance)

Again changing bash won't break anything. Two env variables with
the same name is a pathological case, it should not happen in
real life, the only reason to do it would be to try and exploit
this vulnerability.

The question is whether `fixing' shells and other programs to behave like
getenv and leaving the root cause (at least temporarily) unaddressed will
do any good.
[...]

One consideration is that "fixing" bash once now will fix it
wherever bash is deployed. Fixing the "root cause" involves
changing a lot of different software and may never happen on
some systems and I can't see it happen before perl publishes
their fix.

I agree we should fix the root case, but "fixing" bash won't do
any harm (note that ksh93 and zsh already do it) and will
definitely help.

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 28, 2016

From @rjbs

* Stephane CHAZELAS <stephane.chazelas@​gmail.com> [2016-01-27T12​:20​:17]

2016-01-27 11​:41​:34 -0500, Chet Ramey​:

The question is whether `fixing' shells and other programs to behave like
getenv and leaving the root cause (at least temporarily) unaddressed will
do any good.
[...]

One consideration is that "fixing" bash once now will fix it
wherever bash is deployed. Fixing the "root cause" involves
changing a lot of different software and may never happen on
some systems and I can't see it happen before perl publishes
their fix.

I agree with Stephane here, but let me put that agreement aside for a moment​:
perl is affected in a way that other pieces of software are not, because of its
taint system. It's not reasonable for us to leave that broken while waiting
for every libc to be fixed, so we're going to have to fix that. When we do,
presumably, it will draw some attention to the problem space.

So, perl has to make a change. We can hold off a bit on making the change in
case others would like to synchronize. If everyone else is content to wait for
libc fixes, though, we're not going to wait indefinitely.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Jan 28, 2016

From @stephane-chazelas

2016-01-27 17​:20​:17 +0000, Stephane CHAZELAS​:
[...]

I agree we should fix the root case, but "fixing" bash won't do
any harm (note that ksh93 and zsh already do it) and will
definitely help.
[...]

For the record, it would seem the "issue" was "fixed" (though it
doesn't appear the fix specifically addressed that
vulnerability) in zsh in 1997 in 3.1.3 (3.1.2-zefram1) with this
commit​:

http​://www.zsh.org/mla/workers/1997/msg00555.html

(while importing variables from the environment, don't import
variables that are already marked as exported).

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Jan 28, 2016

From @stephane-chazelas

On 28 January 2016 at 10​:44, Stephane CHAZELAS
<stephane.chazelas@​gmail.com> wrote​:

For the record, it would seem the "issue" was "fixed" (though it
doesn't appear the fix specifically addressed that
vulnerability) in zsh in 1997 in 3.1.3 (3.1.2-zefram1) with this
commit​:

http​://www.zsh.org/mla/workers/1997/msg00555.html

(while importing variables from the environment, don't import
variables that are already marked as exported).
[...]

Sorry, my bad. That commit actually did mention that it was doing the
same thing as ksh
at the time and picking the second env var when there are duplicates.

It was actually fixed by this commit​:
http​://www.zsh.org/mla/workers/2000/msg03237.html
(4.0 (3.1.9-dev-4) in 2000)

(again not specifically to address the vulnerability)

--
Stephane

@p5pRT
Copy link
Author

p5pRT commented Feb 13, 2016

From @rjbs

* Tony Cook via RT <perl5-security-report@​perl.org> [2016-01-26T20​:04​:03]

On Mon Jan 25 21​:09​:09 2016, perl.security@​rjbs.manxome.org wrote​:

Tony Cook is producing an updated patch, which I'll be distributing
downstream.

It's not so much an updated patch, but a new patch.

This got a bit derailled by a failed attempt to coordinate disclosure. Nothing
has come of that, so I'm moving forward. Perl security team​: I just pushed
branch "env-dupes" to the security repository. I will send that downstream
with an embargo on Tuesday if I don't get objections.

Craig​: this also includes your vms.c patch.

I made matching branches for 5.22 and 5.20.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Feb 17, 2016

From @rjbs

The Perl ENV problems have been assigned CVE-2016-2381.

--
rjbs

@p5pRT
Copy link
Author

p5pRT commented Mar 1, 2016

From @fweimer

* Florian Weimer​:

* Chet Ramey​:

I agree that this is something that should be fixed in the C library or the
kernel.

I agree. I would welcome a discussion on libc-alpha. Any fix will
have to be subject to public review because it's a pretty significant
change. One problem is that we have to implement very early in
process startup in glibc, where nothing of the library proper is
available for use.

Now that the issue is public, I filed a glibc bug for it​:

<https://sourceware.org/bugzilla/show_bug.cgi?id=19749>

Thanks,
Florian

@p5pRT
Copy link
Author

p5pRT commented Mar 2, 2016

From @tonycoz

On Tue Mar 01 07​:47​:40 2016, fw@​deneb.enyo.de wrote​:

Now that the issue is public, I filed a glibc bug for it​:

Is there anything that's been discussed in this thread that anyone doesn't want public yet?

Tony

@p5pRT
Copy link
Author

p5pRT commented Mar 9, 2016

From @tonycoz

On Wed Mar 02 15​:19​:13 2016, tonyc wrote​:

On Tue Mar 01 07​:47​:40 2016, fw@​deneb.enyo.de wrote​:

Now that the issue is public, I filed a glibc bug for it​:

Is there anything that's been discussed in this thread that anyone
doesn't want public yet?

Now public.

Tony

@p5pRT
Copy link
Author

p5pRT commented Mar 9, 2016

@tonycoz - Status changed from 'open' to 'pending release'

@p5pRT
Copy link
Author

p5pRT commented May 13, 2016

From @khwilliamson

Thank you for submitting this report. You have helped make Perl better.
 
With the release of Perl 5.24.0 on May 9, 2016, this and 149 other issues have been resolved.

Perl 5.24.0 may be downloaded via https://metacpan.org/release/RJBS/perl-5.24.0

@p5pRT
Copy link
Author

p5pRT commented May 13, 2016

@khwilliamson - Status changed from 'pending release' to 'resolved'

@p5pRT p5pRT closed this as completed May 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant