[Repoze-dev] Fwd: DRAFT: Invitation to a dance

Chris McDonough chrism at plope.com
Sun Sep 30 00:12:00 UTC 2007


Another message I managed not to CC...

Begin forwarded message:

> From: Chris McDonough <chrism at plope.com>
> Date: September 29, 2007 8:09:23 PM EDT
> To: "Graham Dumpleton" <graham.dumpleton at gmail.com>
> Cc: "Philipp von Weitershausen" <philipp at weitershausen.de>,  
> "Christian Theune" <ct at gocept.com>, "Martijn Faassen"  
> <faassen at startifact.com>, "Jim Fulton" <jim at zope.com>, "Ian  
> Bicking" <ianb at colorstudy.com>, "Phillip J. Eby"  
> <pje at telecommunity.com>, "Rob Miller" <ra at burningman.com>, "Joel  
> Burton" <joel at joelburton.com>, "Martin Aspeli" <optilude at gmx.net>
> Subject: Re: DRAFT: Invitation to a dance
>
> On Sep 29, 2007, at 7:20 PM, Graham Dumpleton wrote:
>
>> On 29/09/2007, Chris McDonough <chrism at plope.com> wrote:
>>>> I'm quite impressed by the mod_wsgi integration. I need to try this
>>>> again. How does it deal with multithreadedness and interpreter bootstrap?
>>>
>>> This story is not nearly as cooked as the "development sandbox" or
>>> "pure python server" story.  There's still a lot of figuring out to
>>> do here.  I played around and found that mod_wsgi can be configured
>>> to run programs in a mode a lot like fastcgi, where it starts up
>>> processes that don't live "inside" Apache;
>>
>> The mod_wsgi daemon processes do still run in the context of Apache,
>> i.e., they are a fork only and not a fork/exec of some separate
>> program. As an aid, see diagrams attached to my last post in the
>> following discussion:
>>
>> http://groups.google.com/group/modwsgi/browse_frm/thread/17f9659b3c50bf27/433167b1bdaef532
>
> Wonderful, thank you!  I will try to digest these.
>
>>
>>> you can sort of turn knobs
>>> up and down for number-of-threads and number-of-processes for each
>>> one of these (I must admit I don't quite understand the number-of-
>>> threads parameter yet),
>>
>> For a bit of helpful information read:
>>
>>   http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
>>
>> When using multiple threads in a mod_wsgi daemon process, it is akin
>> to running Apache worker MPM and embedded mode.
>>
>>> as well as which user the process should be
>>> started as (which is very handy).  For
>>> http://www.repoze.org/tmp/plone, the configuration is:
>>>
>>> WSGIPythonExecutable /home/repoze/tmp/site/bin/python
>>
>> You are misunderstanding what the WSGIPythonExecutable directive in
>> mod_wsgi is for. In general this directive should never be used.
>> Apache/mod_wsgi will always use the Python runtime library that it
>> was compiled against (even in daemon mode). This is done through the
>> Python library being linked to the Apache mod_wsgi module.
>>
>> What the WSGIPythonExecutable directive is for is to work around a
>> shortcoming of Python in how it determines where the installed
>> Python lib directory is.
>>
>> What happens is that the Python runtime, when initialised, tries to
>> work out where the Python library directory is by looking for where
>> the Python executable is in the PATH of the process, even though the
>> python program isn't executed in this case. The problem is that if
>> the Apache process uses a PATH that would result in the Python runtime
>> finding a different python executable than what mod_wsgi was compiled
>> against (i.e., multiple instances of Python installed in different
>> root directories), then the wrong Python lib directory will be used
>> and thus the wrong common modules and site-packages directory.
>
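To check which library directory an interpreter actually settled on, a small diagnostic such as the following can be run from inside the process in question. This is only a sketch: the function name describe_runtime is made up for illustration, and it assumes a current Python 3 where the sysconfig module is available.

```python
import sys
import sysconfig


def describe_runtime():
    """Return the locations the running interpreter derived at start-up."""
    return {
        # The "python" program the runtime believes it belongs to. In an
        # embedded interpreter this may be empty or point somewhere else.
        "executable": sys.executable,
        # Installation roots from which the library directories are derived.
        "prefix": sys.prefix,
        "exec_prefix": sys.exec_prefix,
        # Directory holding the pure-Python standard library modules.
        "stdlib": sysconfig.get_path("stdlib"),
        # Directory searched for third-party packages.
        "site_packages": sysconfig.get_path("purelib"),
    }


if __name__ == "__main__":
    for name, value in describe_runtime().items():
        print(f"{name:14} {value}")
```

If the prefix reported here is not the installation that was expected, the lib directory and site-packages directory in use will be the wrong ones too.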
> I think we're actually using it correctly, because what we're after
> is a form of that use case.  The Python we're trying to point to is
> a "virtual" Python (see
> http://peak.telecommunity.com/dist/virtual-python.py).  The Python
> that mod_wsgi is compiled against is also the Python which is the
> source of the "virtual" Python.  Essentially two things may differ
> between the "virtual" Python and the Python used to create it: the
> packages which are in site-packages may differ, and the
> distutils.cfg may differ.  Its version cannot differ.  Really it's
> as if I had installed exactly the same Python version, compiled with
> the same toolchain and libraries, in a different location (e.g. one
> that doesn't happen to be on the Apache process' PATH), but it's
> just done through symlink hackery instead of files.
>
> Being able to specify the interpreter on a per-VirtualHost basis
> would prevent us from needing to do any sys.path munging in the
> wsgi loader.  In particular, it would let us reuse the Python
> 'site' module behavior whereby .pth files put into
> "sys.prefix + '/lib/pythonX.X/site-packages'" are consulted for
> extra info that extends sys.path (used heavily by eggs).  This is
> consistent with the idea that each of our applications will use the
> same Python version; they just must have different sys.path
> settings, and each is represented within the context of a separate
> virtual Python.  See
> http://bob.pythonmac.org/archives/2005/02/06/using-pth-files-for-python-development/
> for more info about .pth files and site directories.
>
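Without a per-VirtualHost interpreter setting, the same 'site' module behavior can be requested by hand from the WSGI script. The sketch below uses the standard library call site.addsitedir(), which appends a directory to sys.path and processes the .pth files inside it. The helper name activate_site_dir and the VIRTUAL_SITE_PACKAGES path are assumptions made for illustration (the "pythonX.X" part is a placeholder, not a real directory name).

```python
import os
import site
import sys


def activate_site_dir(site_packages):
    """Append site_packages to sys.path and process any .pth files in it.

    Returns the entries that were newly added to sys.path.
    """
    before = list(sys.path)
    # addsitedir() adds the directory itself, then reads each *.pth file in
    # it and appends every existing directory listed there.
    site.addsitedir(site_packages)
    return [entry for entry in sys.path if entry not in before]


# Hypothetical location: substitute the site-packages directory of your own
# virtual Python. Nothing happens if the directory does not exist.
VIRTUAL_SITE_PACKAGES = "/home/repoze/tmp/site/lib/pythonX.X/site-packages"

if os.path.isdir(VIRTUAL_SITE_PACKAGES):
    activate_site_dir(VIRTUAL_SITE_PACKAGES)
```

This is still sys.path munging in the loader, only delegated to the 'site' module so that egg-style .pth files keep working.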
> I understand that adding something like this would lead to some  
> people misunderstanding its purpose and believing that you could  
> actually use multiple Python versions simultaneously, but it sure  
> would make life a lot easier for people who wanted to use it this way.
>
>>> WSGIDaemonProcess tmp threads=1 processes=4 maximum-requests=10000
>>>
>>> <Directory /home/repoze/tmp/site/etc>
>>>    Order deny,allow
>>>    Allow from all
>>> </Directory>
>>>
>>> <VirtualHost *:80>
>>>    DocumentRoot /home/repoze/www/www.repoze.org
>>>    ServerName www.repoze.org
>>>    ScriptAlias /viewcvs "/usr/lib/cgi-bin/viewcvs.cgi"
>>>    ServerAdmin repoze-dev at repoze.org
>>>    WSGIScriptAlias /tmp /home/repoze/tmp/site/etc/zope2.wsgi
>>>    WSGIProcessGroup tmp
>>>    WSGIPassAuthorization On
>>>    SetEnv HTTP_X_VHM_HOST http://www.repoze.org/tmp
>>> </VirtualHost>
>>>
>> The intent with mod_wsgi is that the WSGI script file is where you
>> specify everything. Thus, sys.path needs to be modified in it. If you
>> want to have different working environments for different
>> applications then you can use workingenv. See:
>>
>>   http://docs.pythonweb.org/pages/viewpage.action?pageId=5439610
>
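A WSGI script file that follows this approach might look roughly like the sketch below. The APP_LIB_DIR path is hypothetical and the application body is a stand-in; the only fixed point is that mod_wsgi by default looks for a callable named "application" in the script file.

```python
import sys

# Hypothetical directory holding this application's packages.
APP_LIB_DIR = "/home/repoze/tmp/site/lib/python"

# Put the application's directory ahead of the system-wide entries so its
# packages win, without adding it twice if the script file is reloaded.
if APP_LIB_DIR not in sys.path:
    sys.path.insert(0, APP_LIB_DIR)


def application(environ, start_response):
    """Stand-in WSGI callable; a real script would build its app here."""
    body = b"Hello from the WSGI script file\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Note that a plain sys.path.insert() does not process .pth files, which is the gap the "virtual" Python approach is meant to close.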
> Workingenv is no longer maintained: Ian is on to
> "virtualenv" (http://pypi.python.org/pypi/virtualenv), which works
> almost exactly like virtual-python.py except it works on Windows
> and provides some additional tools for customization.  Workingenv
> is basically just a fancy way to set sys.path from what I
> understood of it, and I believe Ian abandoned it because it's more
> convenient and predictable to not have to write additional code
> that munges sys.path at all.
>
>> It is recommended though that applications with different working
>> sets of Python modules be run in different mod_wsgi daemon processes,
>> else you might have problems if different applications want different
>> versions of C extension modules, as Python only loads them once and
>> different sub interpreters can't be using different versions.
>
> I suspect we are always going to run Zope in daemon mode because
> it's the safest thing to do and mod_wsgi is nowhere near the
> bottleneck here.
>
>>
>>> and because the Paste pipeline
>>> doesn't seem to be created until the first request, it can take a
>>> while for the site to "settle down" once it comes up (it might take
>>> 30 seconds to render a single image at first startup time, because
>>> it's bootstrapping a process).
>>
>> It is not bootstrapping a process, although it is bootstrapping your
>> WSGI application. This is only done when the script file is loaded as
>> Apache/mod_wsgi is not a solution for hosting just a single WSGI
>> application. Thus, in general you can't know what may need to be
>> loaded in advance and you can only know once the request arrives and
>> you see what WSGI application a URL maps to.
>
> I see.
>
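The first-request delay described above can be pictured with a small wrapper. This is not mod_wsgi or Paste code, just a sketch under assumed names (LazyApplication, eager, and the factory argument are all hypothetical): the expensive factory runs when the first request arrives unless something calls it earlier.

```python
import threading


class LazyApplication:
    """Defer building a WSGI application until the first request arrives."""

    def __init__(self, factory):
        self._factory = factory   # callable that builds the real WSGI app
        self._app = None
        self._lock = threading.Lock()

    def __call__(self, environ, start_response):
        if self._app is None:
            # Only one thread pays the start-up cost; the others wait.
            with self._lock:
                if self._app is None:
                    self._app = self._factory()
        return self._app(environ, start_response)


def eager(factory):
    """Build the application immediately, e.g. when the script is imported."""
    return factory()
```

With the lazy form the first visitor absorbs the whole start-up cost; with the eager form the cost moves to whenever the script file is loaded, which under mod_wsgi is still the first request mapped to that script.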
>> I have looked at a directive which allows you to preload a WSGI
>> script file, but it gets a bit complicated as you need to distinguish
>> which Python sub interpreter you need to load it into, as well as
>> whether it should be loaded into the main Apache child processes
>> (embedded mode), or a specific set of daemon processes.
>
> Perhaps making the configuration a bit more byzantine on a per-mode  
> basis but more explicit might help.  For example, if you want to  
> work in daemon mode where there are multiple processes, and each  
> process has exactly one interpreter instance, I suspect the  
> preloading becomes pretty trivial.  I guess the difficulty then  
> comes in supporting the other modes where preloading isn't as well  
> defined.
>
>> FWIW, using a WSGI adapter under mod_python would see the same
>> behaviour unless you used PythonImport directive of mod_python to
>> force it to preload your application on process start.
>>
>> Anyway, for any mod_wsgi specific issues you should perhaps come
>> over to the mod_wsgi list and discuss them.
>
> Right.  I'll take further discussion there after I have a bit more  
> experience with configuration.  Thanks very much for taking the  
> time to write this stuff here.
>
>>
>> BTW, I have been looking at a hybrid mode for mod_wsgi to add to
>> embedded and daemon modes. In hybrid mode things would work like
>> daemon mode except that instead of the daemon process being just a
>> fork of the Apache parent process, it would also do an exec of a
>> distinct Python executable. In executing this, it would be told to
>> load up special mod_wsgi modules which implement the daemon side
>> equivalent of daemon mode, but from the Python executable.
>
> That's a great idea.  It's likely I'd be able and willing to help  
> implement that, if only for the feature of being able to specify a  
> distinct Python executable for the WSGIDaemonProcess directive.  I  
> will take that discussion to the mod_wsgi list.
>
>> per Ian Bicking's latest proposals as to how that could work. In
>> other words, in hybrid mode you wouldn't be running a Python instance
>> embedded in the Apache processes, thus it would be more like true
>> fastcgi solutions which actually exec a separate Python instance. The
>> intent though is to make it pretty seamless so that your code and how
>> you configure things in your code is the same for all three mod_wsgi
>> modes.
>
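The practical difference between a fork of the embedded runtime and an exec of a separate Python program is that the exec'd interpreter works out its own prefix from its own executable. The sketch below only demonstrates that property with the standard subprocess module; it says nothing about how hybrid mode would actually be implemented, and the function name prefix_of is made up. It assumes Python 3.7 or later for the capture_output argument.

```python
import subprocess
import sys


def prefix_of(python_executable):
    """Run python_executable as a separate program and return its sys.prefix."""
    result = subprocess.run(
        [python_executable, "-c", "import sys; print(sys.prefix)"],
        capture_output=True,   # collect stdout instead of inheriting it
        text=True,             # decode the output to str
        check=True,            # raise if the child exits with an error
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Ask the interpreter running this script; pass the path of a virtual
    # Python's executable instead to see the prefix it would use.
    print(prefix_of(sys.executable))
```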
> I understand this desire.
>
> Thanks again!
>
> - C
>

_______________________________________________
Repoze-dev mailing list
Repoze-dev at lists.repoze.org
http://lists.repoze.org/mailman/listinfo/repoze-dev


