| 1 | # (c) 2005 Ian Bicking and contributors; written for Paste (http://pythonpaste.org) |
|---|
| 2 | # Licensed under the MIT license: http://www.opensource.org/licenses/mit-license.php |
|---|
| 3 | # (c) 2005 Ian Bicking, Clark C. Evans and contributors |
|---|
| 4 | # This module is part of the Python Paste Project and is released under |
|---|
| 5 | # the MIT License: http://www.opensource.org/licenses/mit-license.php |
|---|
| 6 | # Some of this code was funded by: http://prometheusresearch.com |
|---|
| 7 | """ |
|---|
| 8 | HTTP Message Header Fields (see RFC 4229) |
|---|
| 9 | |
|---|
| 10 | This contains general support for HTTP/1.1 message headers [1]_ in a |
|---|
| 11 | manner that supports WSGI ``environ`` [2]_ and ``response_headers`` |
|---|
| 12 | [3]_. Specifically, this module defines a ``HTTPHeader`` class whose |
|---|
| 13 | instances correspond to field-name items. The actual field-content for |
|---|
| 14 | the message-header is stored in the appropriate WSGI collection (either |
|---|
| 15 | the ``environ`` for requests, or ``response_headers`` for responses). |
|---|
| 16 | |
|---|
| 17 | Each ``HTTPHeader`` instance is a callable (defining ``__call__``) |
|---|
| 18 | that takes one of the following: |
|---|
| 19 | |
|---|
| 20 | - an ``environ`` dictionary, returning the corresponding header |
|---|
| 21 | value by according to the WSGI's ``HTTP_`` prefix mechanism, e.g., |
|---|
| 22 | ``USER_AGENT(environ)`` returns ``environ.get('HTTP_USER_AGENT')`` |
|---|
| 23 | |
|---|
| 24 | - a ``response_headers`` list, giving a comma-delimited string for |
|---|
| 25 | each corresponding ``header_value`` tuple entries (see below). |
|---|
| 26 | |
|---|
| 27 | - a sequence of string ``*args`` that are comma-delimited into |
|---|
| 28 | a single string value: ``CONTENT_TYPE("text/html","text/plain")`` |
|---|
| 29 | returns ``"text/html, text/plain"`` |
|---|
| 30 | |
|---|
| 31 | - a set of ``**kwargs`` keyword arguments that are used to create |
|---|
| 32 | a header value, in a manner dependent upon the particular header in |
|---|
| 33 | question (to make value construction easier and error-free): |
|---|
| 34 | ``CONTENT_DISPOSITION(max_age=CONTENT_DISPOSITION.ONEWEEK)`` |
|---|
| 35 | returns ``"public, max-age=60480"`` |
|---|
| 36 | |
|---|
| 37 | Each ``HTTPHeader`` instance also provides several methods to act on |
|---|
| 38 | a WSGI collection, for removing and setting header values. |
|---|
| 39 | |
|---|
| 40 | ``delete(collection)`` |
|---|
| 41 | |
|---|
| 42 | This method removes all entries of the corresponding header from |
|---|
| 43 | the given collection (``environ`` or ``response_headers``), e.g., |
|---|
| 44 | ``USER_AGENT.remove(environ)`` deletes the 'HTTP_USER_AGENT' entry |
|---|
| 45 | from the ``environ``. |
|---|
| 46 | |
|---|
| 47 | ``update(collection, *args, **kwargs)`` |
|---|
| 48 | |
|---|
| 49 | This method does an in-place replacement of the given header entry, |
|---|
| 50 | for example: ``CONTENT_LENGTH(response_headers,len(body))`` |
|---|
| 51 | |
|---|
| 52 | The first argument is a valid ``environ`` dictionary or |
|---|
| 53 | ``response_headers`` list; remaining arguments are passed on to |
|---|
| 54 | ``__call__(*args, **kwargs)`` for value construction. |
|---|
| 55 | |
|---|
| 56 | ``apply(collection, **kwargs)`` |
|---|
| 57 | |
|---|
| 58 | This method is similar to update, only that it may affect other |
|---|
| 59 | headers. For example, according to recommendations in RFC 2616, |
|---|
| 60 | certain Cache-Control configurations should also set the |
|---|
| 61 | ``Expires`` header for HTTP/1.0 clients. By default, ``apply()`` |
|---|
| 62 | is simply ``update()`` but limited to keyword arguments. |
|---|
| 63 | |
|---|
| 64 | This particular approach to managing headers within a WSGI collection |
|---|
| 65 | has several advantages: |
|---|
| 66 | |
|---|
| 67 | 1. Typos in the header name are easily detected since they become a |
|---|
| 68 | ``NameError`` when executed. The approach of using header strings |
|---|
| 69 | directly can be problematic; for example, the following should |
|---|
| 70 | return ``None`` : ``environ.get("HTTP_ACCEPT_LANGUAGES")`` |
|---|
| 71 | |
|---|
| 72 | 2. For specific headers with validation, using ``__call__`` will |
|---|
| 73 | result in an automatic header value check. For example, the |
|---|
| 74 | _ContentDisposition header will reject a value having ``maxage`` |
|---|
| 75 | or ``max_age`` (the appropriate parameter is ``max-age`` ). |
|---|
| 76 | |
|---|
| 77 | 3. When appending/replacing headers, the field-name has the suggested |
|---|
| 78 | RFC capitalization (e.g. ``Content-Type`` or ``ETag``) for |
|---|
| 79 | user-agents that incorrectly use case-sensitive matches. |
|---|
| 80 | |
|---|
| 81 | 4. Some headers (such as ``Content-Type``) are 0, that is, |
|---|
| 82 | only one entry of this type may occur in a given set of |
|---|
| 83 | ``response_headers``. This module knows about those cases and |
|---|
| 84 | enforces this cardinality constraint. |
|---|
| 85 | |
|---|
| 86 | 5. The exact details of WSGI header management are abstracted so |
|---|
| 87 | the programmer need not worry about operational differences |
|---|
| 88 | between ``environ`` dictionary or ``response_headers`` list. |
|---|
| 89 | |
|---|
| 90 | 6. Sorting of ``HTTPHeaders`` is done following the RFC suggestion |
|---|
| 91 | that general-headers come first, followed by request and response |
|---|
| 92 | headers, and finishing with entity-headers. |
|---|
| 93 | |
|---|
| 94 | 7. Special care is given to exceptional cases such as Set-Cookie |
|---|
| 95 | which violates the RFC's recommendation about combining header |
|---|
| 96 | content into a single entry using comma separation. |
|---|
| 97 | |
|---|
| 98 | A particular difficulty with HTTP message headers is a categorization |
|---|
| 99 | of sorts as described in section 4.2: |
|---|
| 100 | |
|---|
| 101 | Multiple message-header fields with the same field-name MAY be |
|---|
| 102 | present in a message if and only if the entire field-value for |
|---|
| 103 | that header field is defined as a comma-separated list [i.e., |
|---|
| 104 | #(values)]. It MUST be possible to combine the multiple header |
|---|
| 105 | fields into one "field-name: field-value" pair, without changing |
|---|
| 106 | the semantics of the message, by appending each subsequent |
|---|
| 107 | field-value to the first, each separated by a comma. |
|---|
| 108 | |
|---|
| 109 | This creates three fundamentally different kinds of headers: |
|---|
| 110 | |
|---|
| 111 | - Those that do not have a #(values) production, and hence are |
|---|
| 112 | singular and may only occur once in a set of response fields; |
|---|
| 113 | this case is handled by the ``_SingleValueHeader`` subclass. |
|---|
| 114 | |
|---|
| 115 | - Those which have the #(values) production and follow the |
|---|
| 116 | combining rule outlined above; our ``_MultiValueHeader`` case. |
|---|
| 117 | |
|---|
| 118 | - Those which are multi-valued, but cannot be combined (such as the |
|---|
| 119 | ``Set-Cookie`` header due to its ``Expires`` parameter); or where |
|---|
| 120 | combining them into a single header entry would cause common |
|---|
| 121 | user-agents to fail (``WWW-Authenticate``, ``Warning``) since |
|---|
| 122 | they fail to handle dates even when properly quoted. This case |
|---|
| 123 | is handled by ``_MultiEntryHeader``. |
|---|
| 124 | |
|---|
| 125 | Since this project does not have time to provide rigorous support |
|---|
| 126 | and validation for all headers, it does a basic construction of |
|---|
| 127 | headers listed in RFC 2616 (plus a few others) so that they can |
|---|
| 128 | be obtained by simply doing ``from paste.httpheaders import *``; |
|---|
| 129 | the name of the header instance is the "common name" less any |
|---|
| 130 | dashes to give CamelCase style names. |
|---|
| 131 | |
|---|
| 132 | .. [1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2 |
|---|
| 133 | .. [2] http://www.python.org/peps/pep-0333.html#environ-variables |
|---|
| 134 | .. [3] http://www.python.org/peps/pep-0333.html#the-start-response-callable |
|---|
| 135 | |
|---|
| 136 | """ |
|---|
| 137 | import urllib2 |
|---|
| 138 | import re |
|---|
| 139 | from mimetypes import guess_type |
|---|
| 140 | from rfc822 import formatdate, parsedate_tz, mktime_tz |
|---|
| 141 | from time import time as now |
|---|
| 142 | from httpexceptions import HTTPBadRequest |
|---|
| 143 | |
|---|
| 144 | __all__ = ['get_header', 'list_headers', 'normalize_headers', |
|---|
| 145 | 'HTTPHeader', 'EnvironVariable' ] |
|---|
| 146 | |
|---|
| 147 | class EnvironVariable(str): |
|---|
| 148 | """ |
|---|
| 149 | a CGI ``environ`` variable as described by WSGI |
|---|
| 150 | |
|---|
| 151 | This is a helper object so that standard WSGI ``environ`` variables |
|---|
| 152 | can be extracted w/o syntax error possibility. |
|---|
| 153 | """ |
|---|
| 154 | def __call__(self, environ): |
|---|
| 155 | return environ.get(self,'') |
|---|
| 156 | def __repr__(self): |
|---|
| 157 | return '<EnvironVariable %s>' % self |
|---|
| 158 | def update(self, environ, value): |
|---|
| 159 | environ[self] = value |
|---|
| 160 | REMOTE_USER = EnvironVariable("REMOTE_USER") |
|---|
| 161 | REMOTE_SESSION = EnvironVariable("REMOTE_SESSION") |
|---|
| 162 | AUTH_TYPE = EnvironVariable("AUTH_TYPE") |
|---|
| 163 | REQUEST_METHOD = EnvironVariable("REQUEST_METHOD") |
|---|
| 164 | SCRIPT_NAME = EnvironVariable("SCRIPT_NAME") |
|---|
| 165 | PATH_INFO = EnvironVariable("PATH_INFO") |
|---|
| 166 | |
|---|
| 167 | for _name, _obj in globals().items(): |
|---|
| 168 | if isinstance(_obj, EnvironVariable): |
|---|
| 169 | __all__.append(_name) |
|---|
| 170 | |
|---|
| 171 | _headers = {} |
|---|
| 172 | |
|---|
| 173 | class HTTPHeader(object): |
|---|
| 174 | """ |
|---|
| 175 | an HTTP header |
|---|
| 176 | |
|---|
| 177 | HTTPHeader instances represent a particular ``field-name`` of an |
|---|
| 178 | HTTP message header. They do not hold a field-value, but instead |
|---|
| 179 | provide operations that work on is corresponding values. Storage |
|---|
| 180 | of the actual field values is done with WSGI ``environ`` or |
|---|
| 181 | ``response_headers`` as appropriate. Typically, a sub-classes that |
|---|
| 182 | represent a specific HTTP header, such as _ContentDisposition, are |
|---|
| 183 | 0. Once constructed the HTTPHeader instances themselves |
|---|
| 184 | are immutable and stateless. |
|---|
| 185 | |
|---|
| 186 | For purposes of documentation a "container" refers to either a |
|---|
| 187 | WSGI ``environ`` dictionary, or a ``response_headers`` list. |
|---|
| 188 | |
|---|
| 189 | Member variables (and correspondingly constructor arguments). |
|---|
| 190 | |
|---|
| 191 | ``name`` |
|---|
| 192 | |
|---|
| 193 | the ``field-name`` of the header, in "common form" |
|---|
| 194 | as presented in RFC 2616; e.g. 'Content-Type' |
|---|
| 195 | |
|---|
| 196 | ``category`` |
|---|
| 197 | |
|---|
| 198 | one of 'general', 'request', 'response', or 'entity' |
|---|
| 199 | |
|---|
| 200 | ``version`` |
|---|
| 201 | |
|---|
| 202 | version of HTTP (informational) with which the header should |
|---|
| 203 | be recognized |
|---|
| 204 | |
|---|
| 205 | ``sort_order`` |
|---|
| 206 | |
|---|
| 207 | sorting order to be applied before sorting on |
|---|
| 208 | field-name when ordering headers in a response |
|---|
| 209 | |
|---|
| 210 | Special Methods: |
|---|
| 211 | |
|---|
| 212 | ``__call__`` |
|---|
| 213 | |
|---|
| 214 | The primary method of the HTTPHeader instance is to make |
|---|
| 215 | it a callable, it takes either a collection, a string value, |
|---|
| 216 | or keyword arguments and attempts to find/construct a valid |
|---|
| 217 | field-value |
|---|
| 218 | |
|---|
| 219 | ``__lt__`` |
|---|
| 220 | |
|---|
| 221 | This method is used so that HTTPHeader objects can be |
|---|
| 222 | sorted in a manner suggested by RFC 2616. |
|---|
| 223 | |
|---|
| 224 | ``__str__`` |
|---|
| 225 | |
|---|
| 226 | The string-value for instances of this class is |
|---|
| 227 | the ``field-name``. |
|---|
| 228 | |
|---|
| 229 | Primary Methods: |
|---|
| 230 | |
|---|
| 231 | ``delete()`` |
|---|
| 232 | |
|---|
| 233 | remove the all occurrences (if any) of the given |
|---|
| 234 | header in the collection provided |
|---|
| 235 | |
|---|
| 236 | ``update()`` |
|---|
| 237 | |
|---|
| 238 | replaces (if they exist) all field-value items |
|---|
| 239 | in the given collection with the value provided |
|---|
| 240 | |
|---|
| 241 | ``tuples()`` |
|---|
| 242 | |
|---|
| 243 | returns a set of (field-name, field-value) tuples |
|---|
| 244 | 5 for extending ``response_headers`` |
|---|
| 245 | |
|---|
| 246 | Custom Methods (these may not be implemented): |
|---|
| 247 | |
|---|
| 248 | ``apply()`` |
|---|
| 249 | |
|---|
| 250 | similar to ``update``, but with two differences; first, |
|---|
| 251 | only keyword arguments can be used, and second, specific |
|---|
| 252 | sub-classes may introduce side-effects |
|---|
| 253 | |
|---|
| 254 | ``parse()`` |
|---|
| 255 | |
|---|
| 256 | converts a string value of the header into a more usable |
|---|
| 257 | form, such as time in seconds for a date header, etc. |
|---|
| 258 | |
|---|
| 259 | The collected versions of initialized header instances are immediately |
|---|
| 260 | registered and accessible through the ``get_header`` function. Do not |
|---|
| 261 | inherit from this directly, use one of ``_SingleValueHeader``, |
|---|
| 262 | ``_MultiValueHeader``, or ``_MultiEntryHeader`` as appropriate. |
|---|
| 263 | """ |
|---|
| 264 | |
|---|
| 265 | # |
|---|
| 266 | # Things which can be customized |
|---|
| 267 | # |
|---|
| 268 | version = '1.1' |
|---|
| 269 | category = 'general' |
|---|
| 270 | reference = '' |
|---|
| 271 | extensions = {} |
|---|
| 272 | |
|---|
| 273 | def compose(self, **kwargs): |
|---|
| 274 | """ |
|---|
| 275 | build header value from keyword arguments |
|---|
| 276 | |
|---|
| 277 | This method is used to build the corresponding header value when |
|---|
| 278 | keyword arguments (or no arguments) were provided. The result |
|---|
| 279 | should be a sequence of values. For example, the ``Expires`` |
|---|
| 280 | header takes a keyword argument ``time`` (e.g. time.time()) from |
|---|
| 281 | which it returns a the corresponding date. |
|---|
| 282 | """ |
|---|
| 283 | raise NotImplementedError() |
|---|
| 284 | |
|---|
| 285 | def parse(self, *args, **kwargs): |
|---|
| 286 | """ |
|---|
| 287 | convert raw header value into more usable form |
|---|
| 288 | |
|---|
| 289 | This method invokes ``values()`` with the arguments provided, |
|---|
| 290 | parses the header results, and then returns a header-specific |
|---|
| 291 | data structure corresponding to the header. For example, the |
|---|
| 292 | ``Expires`` header returns seconds (as returned by time.time()) |
|---|
| 293 | """ |
|---|
| 294 | raise NotImplementedError() |
|---|
| 295 | |
|---|
| 296 | def apply(self, collection, **kwargs): |
|---|
| 297 | """ |
|---|
| 298 | update the collection /w header value (may have side effects) |
|---|
| 299 | |
|---|
| 300 | This method is similar to ``update`` only that usage may result |
|---|
| 301 | in other headers being changed as recommended by the corresponding |
|---|
| 302 | specification. The return value is defined by the particular |
|---|
| 303 | sub-class. For example, the ``_CacheControl.apply()`` sets the |
|---|
| 304 | ``Expires`` header in addition to its normal behavior. |
|---|
| 305 | """ |
|---|
| 306 | self.update(collection, **kwargs) |
|---|
| 307 | |
|---|
| 308 | # |
|---|
| 309 | # Things which are standardized (mostly) |
|---|
| 310 | # |
|---|
| 311 | def __new__(cls, name, category=None, reference=None, version=None): |
|---|
| 312 | """ |
|---|
| 313 | construct a new ``HTTPHeader`` instance |
|---|
| 314 | |
|---|
| 315 | We use the ``__new__`` operator to ensure that only one |
|---|
| 316 | ``HTTPHeader`` instance exists for each field-name, and to |
|---|
| 317 | register the header so that it can be found/enumerated. |
|---|
| 318 | """ |
|---|
| 319 | self = get_header(name, raiseError=False) |
|---|
| 320 | if self: |
|---|
| 321 | # Allow the registration to happen again, but assert |
|---|
| 322 | # that everything is identical. |
|---|
| 323 | assert self.name == name, \ |
|---|
| 324 | "duplicate registration with different capitalization" |
|---|
| 325 | assert self.category == category, \ |
|---|
| 326 | "duplicate registration with different category" |
|---|
| 327 | assert cls == self.__class__, \ |
|---|
| 328 | "duplicate registration with different class" |
|---|
| 329 | return self |
|---|
| 330 | |
|---|
| 331 | self = object.__new__(cls) |
|---|
| 332 | self.name = name |
|---|
| 333 | assert isinstance(self.name, str) |
|---|
| 334 | self.category = category or self.category |
|---|
| 335 | self.version = version or self.version |
|---|
| 336 | self.reference = reference or self.reference |
|---|
| 337 | _headers[self.name.lower()] = self |
|---|
| 338 | self.sort_order = {'general': 1, 'request': 2, |
|---|
| 339 | 'response': 3, 'entity': 4 }[self.category] |
|---|
| 340 | self._environ_name = getattr(self, '_environ_name', |
|---|
| 341 | 'HTTP_'+ self.name.upper().replace("-","_")) |
|---|
| 342 | self._headers_name = getattr(self, '_headers_name', |
|---|
| 343 | self.name.lower()) |
|---|
| 344 | assert self.version in ('1.1', '1.0', '0.9') |
|---|
| 345 | return self |
|---|
| 346 | |
|---|
| 347 | def __str__(self): |
|---|
| 348 | return self.name |
|---|
| 349 | |
|---|
| 350 | def __lt__(self, other): |
|---|
| 351 | """ |
|---|
| 352 | sort header instances as specified by RFC 2616 |
|---|
| 353 | |
|---|
| 354 | Re-define sorting so that general headers are first, followed |
|---|
| 355 | by request/response headers, and then entity headers. The |
|---|
| 356 | list.sort() methods use the less-than operator for this purpose. |
|---|
| 357 | """ |
|---|
| 358 | if isinstance(other, HTTPHeader): |
|---|
| 359 | if self.sort_order != other.sort_order: |
|---|
| 360 | return self.sort_order < other.sort_order |
|---|
| 361 | return self.name < other.name |
|---|
| 362 | return False |
|---|
| 363 | |
|---|
| 364 | def __repr__(self): |
|---|
| 365 | ref = self.reference and (' (%s)' % self.reference) or '' |
|---|
| 366 | return '<%s %s%s>' % (self.__class__.__name__, self.name, ref) |
|---|
| 367 | |
|---|
| 368 | def values(self, *args, **kwargs): |
|---|
| 369 | """ |
|---|
| 370 | find/construct field-value(s) for the given header |
|---|
| 371 | |
|---|
| 372 | Resolution is done according to the following arguments: |
|---|
| 373 | |
|---|
| 374 | - If only keyword arguments are given, then this is equivalent |
|---|
| 375 | to ``compose(**kwargs)``. |
|---|
| 376 | |
|---|
| 377 | - If the first (and only) argument is a dict, it is assumed |
|---|
| 378 | to be a WSGI ``environ`` and the result of the corresponding |
|---|
| 379 | ``HTTP_`` entry is returned. |
|---|
| 380 | |
|---|
| 381 | - If the first (and only) argument is a list, it is assumed |
|---|
| 382 | to be a WSGI ``response_headers`` and the field-value(s) |
|---|
| 383 | for this header are collected and returned. |
|---|
| 384 | |
|---|
| 385 | - In all other cases, the arguments are collected, checked that |
|---|
| 386 | they are string values, possibly verified by the header's |
|---|
| 387 | logic, and returned. |
|---|
| 388 | |
|---|
| 389 | At this time it is an error to provide keyword arguments if args |
|---|
| 390 | is present (this might change). It is an error to provide both |
|---|
| 391 | a WSGI object and also string arguments. If no arguments are |
|---|
| 392 | provided, then ``compose()`` is called to provide a default |
|---|
| 393 | value for the header; if there is not default it is an error. |
|---|
| 394 | """ |
|---|
| 395 | if not args: |
|---|
| 396 | return self.compose(**kwargs) |
|---|
| 397 | if list == type(args[0]): |
|---|
| 398 | assert 1 == len(args) |
|---|
| 399 | result = [] |
|---|
| 400 | name = self.name.lower() |
|---|
| 401 | for value in [value for header, value in args[0] |
|---|
| 402 | if header.lower() == name]: |
|---|
| 403 | result.append(value) |
|---|
| 404 | return result |
|---|
| 405 | if dict == type(args[0]): |
|---|
| 406 | assert 1 == len(args) and 'wsgi.version' in args[0] |
|---|
| 407 | value = args[0].get(self._environ_name) |
|---|
| 408 | if not value: |
|---|
| 409 | return () |
|---|
| 410 | return (value,) |
|---|
| 411 | for item in args: |
|---|
| 412 | assert not type(item) in (dict, list) |
|---|
| 413 | return args |
|---|
| 414 | |
|---|
| 415 | def __call__(self, *args, **kwargs): |
|---|
| 416 | """ |
|---|
| 417 | converts ``values()`` into a string value |
|---|
| 418 | |
|---|
| 419 | This method converts the results of ``values()`` into a string |
|---|
| 420 | value for common usage. By default, it is asserted that only |
|---|
| 421 | one value exists; if you need to access all values then either |
|---|
| 422 | call ``values()`` directly, or inherit ``_MultiValueHeader`` |
|---|
| 423 | which overrides this method to return a comma separated list of |
|---|
| 424 | values as described by section 4.2 of RFC 2616. |
|---|
| 425 | """ |
|---|
| 426 | values = self.values(*args, **kwargs) |
|---|
| 427 | assert isinstance(values, (tuple, list)) |
|---|
| 428 | if not values: |
|---|
| 429 | return '' |
|---|
| 430 | assert len(values) == 1, "more than one value: %s" % repr(values) |
|---|
| 431 | return str(values[0]).strip() |
|---|
| 432 | |
|---|
| 433 | def delete(self, collection): |
|---|
| 434 | """ |
|---|
| 435 | removes all occurances of the header from the collection provided |
|---|
| 436 | """ |
|---|
| 437 | if type(collection) == dict: |
|---|
| 438 | if self._environ_name in collection: |
|---|
| 439 | del collection[self._environ_name] |
|---|
| 440 | return self |
|---|
| 441 | assert list == type(collection) |
|---|
| 442 | i = 0 |
|---|
| 443 | while i < len(collection): |
|---|
| 444 | if collection[i][0].lower() == self._headers_name: |
|---|
| 445 | del collection[i] |
|---|
| 446 | continue |
|---|
| 447 | i += 1 |
|---|
| 448 | |
|---|
| 449 | def update(self, collection, *args, **kwargs): |
|---|
| 450 | """ |
|---|
| 451 | updates the collection with the provided header value |
|---|
| 452 | |
|---|
| 453 | This method replaces (in-place when possible) all occurrences of |
|---|
| 454 | the given header with the provided value. If no value is |
|---|
| 455 | provided, this is the same as ``remove`` (note that this case |
|---|
| 456 | can only occur if the target is a collection w/o a corresponding |
|---|
| 457 | header value). The return value is the new header value (which |
|---|
| 458 | could be a list for ``_MultiEntryHeader`` instances). |
|---|
| 459 | """ |
|---|
| 460 | value = self.__call__(*args, **kwargs) |
|---|
| 461 | if not value: |
|---|
| 462 | self.remove(collection) |
|---|
| 463 | return |
|---|
| 464 | if type(collection) == dict: |
|---|
| 465 | collection[self._environ_name] = value |
|---|
| 466 | return |
|---|
| 467 | assert list == type(collection) |
|---|
| 468 | i = 0 |
|---|
| 469 | found = False |
|---|
| 470 | while i < len(collection): |
|---|
| 471 | if collection[i][0].lower() == self._headers_name: |
|---|
| 472 | if found: |
|---|
| 473 | del collection[i] |
|---|
| 474 | continue |
|---|
| 475 | collection[i] = (self.name, value) |
|---|
| 476 | found = True |
|---|
| 477 | i += 1 |
|---|
| 478 | if not found: |
|---|
| 479 | collection.append((self.name, value)) |
|---|
| 480 | |
|---|
| 481 | def tuples(self, *args, **kwargs): |
|---|
| 482 | value = self.__call__(*args, **kwargs) |
|---|
| 483 | if not value: |
|---|
| 484 | return () |
|---|
| 485 | return [(self.name, value)] |
|---|
| 486 | |
|---|
| 487 | class _SingleValueHeader(HTTPHeader): |
|---|
| 488 | """ |
|---|
| 489 | a ``HTTPHeader`` with exactly a single value |
|---|
| 490 | |
|---|
| 491 | This is the default behavior of ``HTTPHeader`` where returning a |
|---|
| 492 | the string-value of headers via ``__call__`` assumes that only |
|---|
| 493 | a single value exists. |
|---|
| 494 | """ |
|---|
| 495 | pass |
|---|
| 496 | |
|---|
| 497 | class _MultiValueHeader(HTTPHeader): |
|---|
| 498 | """ |
|---|
| 499 | a ``HTTPHeader`` with one or more values |
|---|
| 500 | |
|---|
| 501 | The field-value for these header instances is is allowed to be more |
|---|
| 502 | than one value; whereby the ``__call__`` method returns a comma |
|---|
| 503 | separated list as described by section 4.2 of RFC 2616. |
|---|
| 504 | """ |
|---|
| 505 | |
|---|
| 506 | def __call__(self, *args, **kwargs): |
|---|
| 507 | results = self.values(*args, **kwargs) |
|---|
| 508 | if not results: |
|---|
| 509 | return '' |
|---|
| 510 | return ", ".join([str(v).strip() for v in results]) |
|---|
| 511 | |
|---|
| 512 | def parse(self, *args, **kwargs): |
|---|
| 513 | value = self.__call__(*args, **kwargs) |
|---|
| 514 | values = value.split(',') |
|---|
| 515 | return [ |
|---|
| 516 | v.strip() for v in values |
|---|
| 517 | if v.strip()] |
|---|
| 518 | |
|---|
| 519 | class _MultiEntryHeader(HTTPHeader): |
|---|
| 520 | """ |
|---|
| 521 | a multi-value ``HTTPHeader`` where items cannot be combined with a comma |
|---|
| 522 | |
|---|
| 523 | This header is multi-valued, but the values should not be combined |
|---|
| 524 | with a comma since the header is not in compliance with RFC 2616 |
|---|
| 525 | (Set-Cookie due to Expires parameter) or which common user-agents do |
|---|
| 526 | not behave well when the header values are combined. |
|---|
| 527 | """ |
|---|
| 528 | |
|---|
| 529 | def update(self, collection, *args, **kwargs): |
|---|
| 530 | assert list == type(collection), "``environ`` may not be updated" |
|---|
| 531 | self.delete(collection) |
|---|
| 532 | collection.extend(self.tuples(*args, **kwargs)) |
|---|
| 533 | |
|---|
| 534 | def tuples(self, *args, **kwargs): |
|---|
| 535 | values = self.values(*args, **kwargs) |
|---|
| 536 | if not values: |
|---|
| 537 | return () |
|---|
| 538 | return [(self.name, value.strip()) for value in values] |
|---|
| 539 | |
|---|
| 540 | def get_header(name, raiseError=True): |
|---|
| 541 | """ |
|---|
| 542 | find the given ``HTTPHeader`` instance |
|---|
| 543 | |
|---|
| 544 | This function finds the corresponding ``HTTPHeader`` for the |
|---|
| 545 | ``name`` provided. So that python-style names can be used, |
|---|
| 546 | underscores are converted to dashes before the lookup. |
|---|
| 547 | """ |
|---|
| 548 | retval = _headers.get(str(name).strip().lower().replace("_","-")) |
|---|
| 549 | if not retval and raiseError: |
|---|
| 550 | raise AssertionError("'%s' is an unknown header" % name) |
|---|
| 551 | return retval |
|---|
| 552 | |
|---|
| 553 | def list_headers(general=None, request=None, response=None, entity=None): |
|---|
| 554 | " list all headers for a given category " |
|---|
| 555 | if not (general or request or response or entity): |
|---|
| 556 | general = request = response = entity = True |
|---|
| 557 | search = [] |
|---|
| 558 | for (bool, strval) in ((general, 'general'), (request, 'request'), |
|---|
| 559 | (response, 'response'), (entity, 'entity')): |
|---|
| 560 | if bool: |
|---|
| 561 | search.append(strval) |
|---|
| 562 | return [head for head in _headers.values() if head.category in search] |
|---|
| 563 | |
|---|
| 564 | def normalize_headers(response_headers, strict=True): |
|---|
| 565 | """ |
|---|
| 566 | sort headers as suggested by RFC 2616 |
|---|
| 567 | |
|---|
| 568 | This alters the underlying response_headers to use the common |
|---|
| 569 | name for each header; as well as sorting them with general |
|---|
| 570 | headers first, followed by request/response headers, then |
|---|
| 571 | entity headers, and unknown headers last. |
|---|
| 572 | """ |
|---|
| 573 | category = {} |
|---|
| 574 | for idx in range(len(response_headers)): |
|---|
| 575 | (key, val) = response_headers[idx] |
|---|
| 576 | head = get_header(key, strict) |
|---|
| 577 | if not head: |
|---|
| 578 | newhead = '-'.join([x.capitalize() for x in |
|---|
| 579 | key.replace("_","-").split("-")]) |
|---|
| 580 | response_headers[idx] = (newhead, val) |
|---|
| 581 | category[newhead] = 4 |
|---|
| 582 | continue |
|---|
| 583 | response_headers[idx] = (str(head), val) |
|---|
| 584 | category[str(head)] = head.sort_order |
|---|
| 585 | def compare(a, b): |
|---|
| 586 | ac = category[a[0]] |
|---|
| 587 | bc = category[b[0]] |
|---|
| 588 | if ac == bc: |
|---|
| 589 | return cmp(a[0], b[0]) |
|---|
| 590 | return cmp(ac, bc) |
|---|
| 591 | response_headers.sort(compare) |
|---|
| 592 | |
|---|
| 593 | class _DateHeader(_SingleValueHeader): |
|---|
| 594 | """ |
|---|
| 595 | handle date-based headers |
|---|
| 596 | |
|---|
| 597 | This extends the ``_SingleValueHeader`` object with specific |
|---|
| 598 | treatment of time values: |
|---|
| 599 | |
|---|
| 600 | - It overrides ``compose`` to provide a sole keyword argument |
|---|
| 601 | ``time`` which is an offset in seconds from the current time. |
|---|
| 602 | |
|---|
| 603 | - A ``time`` method is provided which parses the given value |
|---|
| 604 | and returns the current time value. |
|---|
| 605 | """ |
|---|
| 606 | |
|---|
| 607 | def compose(self, time=None, delta=None): |
|---|
| 608 | time = time or now() |
|---|
| 609 | if delta: |
|---|
| 610 | assert type(delta) == int |
|---|
| 611 | time += delta |
|---|
| 612 | return (formatdate(time),) |
|---|
| 613 | |
|---|
| 614 | def parse(self, *args, **kwargs): |
|---|
| 615 | """ return the time value (in seconds since 1970) """ |
|---|
| 616 | value = self.__call__(*args, **kwargs) |
|---|
| 617 | if value: |
|---|
| 618 | try: |
|---|
| 619 | return mktime_tz(parsedate_tz(value)) |
|---|
| 620 | except TypeError: |
|---|
| 621 | raise HTTPBadRequest(( |
|---|
| 622 | "Received an ill-formed timestamp for %s: %s\r\n") % |
|---|
| 623 | (self.name, value)) |
|---|
| 624 | |
|---|
| 625 | # |
|---|
| 626 | # Following are specific HTTP headers. Since these classes are mostly |
|---|
| 627 | # singletons, there is no point in keeping the class around once it has |
|---|
| 628 | # been instantiated, so we use the same name. |
|---|
| 629 | # |
|---|
| 630 | |
|---|
| 631 | class _CacheControl(_MultiValueHeader): |
|---|
| 632 | """ |
|---|
| 633 | Cache-Control, RFC 2616 14.9 (use ``CACHE_CONTROL``) |
|---|
| 634 | |
|---|
| 635 | This header can be constructed (using keyword arguments), by |
|---|
| 636 | first specifying one of the following mechanisms: |
|---|
| 637 | |
|---|
| 638 | ``public`` |
|---|
| 639 | |
|---|
| 640 | if True, this argument specifies that the |
|---|
| 641 | response, as a whole, may be cashed. |
|---|
| 642 | |
|---|
| 643 | ``private`` |
|---|
| 644 | |
|---|
| 645 | if True, this argument specifies that the response, as a |
|---|
| 646 | whole, may be cashed; this implementation does not support |
|---|
| 647 | the enumeration of private fields |
|---|
| 648 | |
|---|
| 649 | ``no_cache`` |
|---|
| 650 | |
|---|
| 651 | if True, this argument specifies that the response, as a |
|---|
| 652 | whole, may not be cashed; this implementation does not |
|---|
| 653 | support the enumeration of private fields |
|---|
| 654 | |
|---|
| 655 | In general, only one of the above three may be True, the other 2 |
|---|
| 656 | must then be False or None. If all three are None, then the cache |
|---|
| 657 | is assumed to be ``public``. Following one of these mechanism |
|---|
| 658 | specifiers are various modifiers: |
|---|
| 659 | |
|---|
| 660 | ``no_store`` |
|---|
| 661 | |
|---|
| 662 | indicates if content may be stored on disk; |
|---|
| 663 | otherwise cache is limited to memory (note: |
|---|
| 664 | users can still save the data, this applies |
|---|
| 665 | to intermediate caches) |
|---|
| 666 | |
|---|
| 667 | ``max_age`` |
|---|
| 668 | |
|---|
| 669 | the maximum duration (in seconds) for which |
|---|
| 670 | the content should be cached; if ``no-cache`` |
|---|
| 671 | is specified, this defaults to 0 seconds |
|---|
| 672 | |
|---|
| 673 | ``s_maxage`` |
|---|
| 674 | |
|---|
| 675 | the maximum duration (in seconds) for which the |
|---|
| 676 | content should be allowed in a shared cache. |
|---|
| 677 | |
|---|
| 678 | ``no_transform`` |
|---|
| 679 | |
|---|
| 680 | specifies that an intermediate cache should |
|---|
| 681 | not convert the content from one type to |
|---|
| 682 | another (e.g. transform a BMP to a PNG). |
|---|
| 683 | |
|---|
| 684 | ``extensions`` |
|---|
| 685 | |
|---|
| 686 | gives additional cache-control extensions, |
|---|
| 687 | such as items like, community="UCI" (14.9.6) |
|---|
| 688 | |
|---|
| 689 | The usage of ``apply()`` on this header has side-effects. As |
|---|
| 690 | recommended by RFC 2616, if ``max_age`` is provided, then then the |
|---|
| 691 | ``Expires`` header is also calculated for HTTP/1.0 clients and |
|---|
| 692 | proxies (this is done at the time ``apply()`` is called). For |
|---|
| 693 | ``no-cache`` and for ``private`` cases, we either do not want the |
|---|
| 694 | response cached or do not want any response accidently returned to |
|---|
| 695 | other users; so to prevent this case, we set the ``Expires`` header |
|---|
| 696 | to the time of the request, signifying to HTTP/1.0 transports that |
|---|
| 697 | the content isn't to be cached. If you are using SSL, your |
|---|
| 698 | communication is already "private", so to work with HTTP/1.0 |
|---|
| 699 | browsers over SSL, consider specifying your cache as ``public`` as |
|---|
| 700 | the distinction between public and private is moot. |
|---|
| 701 | """ |
|---|
| 702 | |
|---|
| 703 | # common values for max-age; "good enough" approximates |
|---|
| 704 | ONE_HOUR = 60*60 |
|---|
| 705 | ONE_DAY = ONE_HOUR * 24 |
|---|
| 706 | ONE_WEEK = ONE_DAY * 7 |
|---|
| 707 | ONE_MONTH = ONE_DAY * 30 |
|---|
| 708 | ONE_YEAR = ONE_WEEK * 52 |
|---|
| 709 | |
|---|
| 710 | def _compose(self, public=None, private=None, no_cache=None, |
|---|
| 711 | no_store=False, max_age=None, s_maxage=None, |
|---|
| 712 | no_transform=False, **extensions): |
|---|
| 713 | assert isinstance(max_age, (type(None), int)) |
|---|
| 714 | assert isinstance(s_maxage, (type(None), int)) |
|---|
| 715 | expires = 0 |
|---|
| 716 | result = [] |
|---|
| 717 | if private is True: |
|---|
| 718 | assert not public and not no_cache and not s_maxage |
|---|
| 719 | result.append('private') |
|---|
| 720 | elif no_cache is True: |
|---|
| 721 | assert not public and not private and not max_age |
|---|
| 722 | result.append('no-cache') |
|---|
| 723 | else: |
|---|
| 724 | assert public is None or public is True |
|---|
| 725 | assert not private and not no_cache |
|---|
| 726 | expires = max_age |
|---|
| 727 | result.append('public') |
|---|
| 728 | if no_store: |
|---|
| 729 | result.append('no-store') |
|---|
| 730 | if no_transform: |
|---|
| 731 | result.append('no-transform') |
|---|
| 732 | if max_age is not None: |
|---|
| 733 | result.append('max-age=%d' % max_age) |
|---|
| 734 | if s_maxage is not None: |
|---|
| 735 | result.append('s-maxage=%d' % s_maxage) |
|---|
| 736 | for (k, v) in extensions.items(): |
|---|
| 737 | if k not in self.extensions: |
|---|
| 738 | raise AssertionError("unexpected extension used: '%s'" % k) |
|---|
| 739 | result.append('%s="%s"' % (k.replace("_", "-"), v)) |
|---|
| 740 | return (result, expires) |
|---|
| 741 | |
|---|
| 742 | def compose(self, **kwargs): |
|---|
| 743 | (result, expires) = self._compose(**kwargs) |
|---|
| 744 | return result |
|---|
| 745 | |
|---|
| 746 | def apply(self, collection, **kwargs): |
|---|
| 747 | """ returns the offset expiration in seconds """ |
|---|
| 748 | (result, expires) = self._compose(**kwargs) |
|---|
| 749 | if expires is not None: |
|---|
| 750 | EXPIRES.update(collection, delta=expires) |
|---|
| 751 | self.update(collection, *result) |
|---|
| 752 | return expires |
|---|
| 753 | |
|---|
| 754 | _CacheControl('Cache-Control', 'general', 'RFC 2616, 14.9') |
|---|
| 755 | |
|---|
| 756 | class _ContentType(_SingleValueHeader): |
|---|
| 757 | """ |
|---|
| 758 | Content-Type, RFC 2616 section 14.17 |
|---|
| 759 | |
|---|
| 760 | Unlike other headers, use the CGI variable instead. |
|---|
| 761 | """ |
|---|
| 762 | version = '1.0' |
|---|
| 763 | _environ_name = 'CONTENT_TYPE' |
|---|
| 764 | |
|---|
| 765 | # common mimetype constants |
|---|
| 766 | UNKNOWN = 'application/octet-stream' |
|---|
| 767 | TEXT_PLAIN = 'text/plain' |
|---|
| 768 | TEXT_HTML = 'text/html' |
|---|
| 769 | TEXT_XML = 'text/xml' |
|---|
| 770 | |
|---|
| 771 | def compose(self, major=None, minor=None, charset=None): |
|---|
| 772 | if not major: |
|---|
| 773 | if minor in ('plain', 'html', 'xml'): |
|---|
| 774 | major = 'text' |
|---|
| 775 | else: |
|---|
| 776 | assert not minor and not charset |
|---|
| 777 | return (self.UNKNOWN,) |
|---|
| 778 | if not minor: |
|---|
| 779 | minor = "*" |
|---|
| 780 | result = "%s/%s" % (major, minor) |
|---|
| 781 | if charset: |
|---|
| 782 | result += "; charset=%s" % charset |
|---|
| 783 | return (result,) |
|---|
| 784 | |
|---|
| 785 | _ContentType('Content-Type', 'entity', 'RFC 2616, 14.17') |
|---|
| 786 | |
|---|
| 787 | class _ContentLength(_SingleValueHeader): |
|---|
| 788 | """ |
|---|
| 789 | Content-Length, RFC 2616 section 14.13 |
|---|
| 790 | |
|---|
| 791 | Unlike other headers, use the CGI variable instead. |
|---|
| 792 | """ |
|---|
| 793 | version = "1.0" |
|---|
| 794 | _environ_name = 'CONTENT_LENGTH' |
|---|
| 795 | |
|---|
| 796 | _ContentLength('Content-Length', 'entity', 'RFC 2616, 14.13') |
|---|
| 797 | |
|---|
| 798 | class _ContentDisposition(_SingleValueHeader): |
|---|
| 799 | """ |
|---|
| 800 | Content-Disposition, RFC 2183 (use ``CONTENT_DISPOSITION``) |
|---|
| 801 | |
|---|
| 802 | This header can be constructed (using keyword arguments), |
|---|
| 803 | by first specifying one of the following mechanisms: |
|---|
| 804 | |
|---|
| 805 | ``attachment`` |
|---|
| 806 | |
|---|
| 807 | if True, this specifies that the content should not be |
|---|
| 808 | shown in the browser and should be handled externally, |
|---|
| 809 | even if the browser could render the content |
|---|
| 810 | |
|---|
| 811 | ``inline`` |
|---|
| 812 | |
|---|
| 813 | exclusive with attachment; indicates that the content |
|---|
| 814 | should be rendered in the browser if possible, but |
|---|
| 815 | otherwise it should be handled externally |
|---|
| 816 | |
|---|
| 817 | Only one of the above 2 may be True. If both are None, then |
|---|
| 818 | the disposition is assumed to be an ``attachment``. These are |
|---|
| 819 | distinct fields since support for field enumeration may be |
|---|
| 820 | added in the future. |
|---|
| 821 | |
|---|
| 822 | ``filename`` |
|---|
| 823 | |
|---|
| 824 | the filename parameter, if any, to be reported; if |
|---|
| 825 | this is None, then the current object's filename |
|---|
| 826 | attribute is used |
|---|
| 827 | |
|---|
| 828 | The usage of ``apply()`` on this header has side-effects. If |
|---|
| 829 | filename is provided, and Content-Type is not set or is |
|---|
| 830 | 'application/octet-stream', then the mimetypes.guess is used to |
|---|
| 831 | upgrade the Content-Type setting. |
|---|
| 832 | """ |
|---|
| 833 | |
|---|
| 834 | def _compose(self, attachment=None, inline=None, filename=None): |
|---|
| 835 | result = [] |
|---|
| 836 | if inline is True: |
|---|
| 837 | assert not attachment |
|---|
| 838 | result.append('inline') |
|---|
| 839 | else: |
|---|
| 840 | assert not inline |
|---|
| 841 | result.append('attachment') |
|---|
| 842 | if filename: |
|---|
| 843 | assert '"' not in filename |
|---|
| 844 | filename = filename.split("/")[-1] |
|---|
| 845 | filename = filename.split("\\")[-1] |
|---|
| 846 | result.append('filename="%s"' % filename) |
|---|
| 847 | return (("; ".join(result),), filename) |
|---|
| 848 | |
|---|
| 849 | def compose(self, **kwargs): |
|---|
| 850 | (result, mimetype) = self._compose(**kwargs) |
|---|
| 851 | return result |
|---|
| 852 | |
|---|
| 853 | def apply(self, collection, **kwargs): |
|---|
| 854 | """ return the new Content-Type side-effect value """ |
|---|
| 855 | (result, filename) = self._compose(**kwargs) |
|---|
| 856 | mimetype = CONTENT_TYPE(collection) |
|---|
| 857 | if filename and (not mimetype or CONTENT_TYPE.UNKNOWN == mimetype): |
|---|
| 858 | mimetype, _ = guess_type(filename) |
|---|
| 859 | if mimetype and CONTENT_TYPE.UNKNOWN != mimetype: |
|---|
| 860 | CONTENT_TYPE.update(collection, mimetype) |
|---|
| 861 | self.update(collection, *result) |
|---|
| 862 | return mimetype |
|---|
| 863 | |
|---|
| 864 | _ContentDisposition('Content-Disposition', 'entity', 'RFC 2183') |
|---|
| 865 | |
|---|
| 866 | class _IfModifiedSince(_DateHeader): |
|---|
| 867 | """ |
|---|
| 868 | If-Modified-Since, RFC 2616 section 14.25 |
|---|
| 869 | """ |
|---|
| 870 | version = '1.0' |
|---|
| 871 | |
|---|
| 872 | def __call__(self, *args, **kwargs): |
|---|
| 873 | """ |
|---|
| 874 | Split the value on ';' incase the header includes extra attributes. E.g. |
|---|
| 875 | IE 6 is known to send: |
|---|
| 876 | If-Modified-Since: Sun, 25 Jun 2006 20:36:35 GMT; length=1506 |
|---|
| 877 | """ |
|---|
| 878 | return _DateHeader.__call__(self, *args, **kwargs).split(';', 1)[0] |
|---|
| 879 | |
|---|
| 880 | def parse(self, *args, **kwargs): |
|---|
| 881 | value = _DateHeader.parse(self, *args, **kwargs) |
|---|
| 882 | if value and value > now(): |
|---|
| 883 | raise HTTPBadRequest(( |
|---|
| 884 | "Please check your system clock.\r\n" |
|---|
| 885 | "According to this server, the time provided in the\r\n" |
|---|
| 886 | "%s header is in the future.\r\n") % self.name) |
|---|
| 887 | return value |
|---|
| 888 | _IfModifiedSince('If-Modified-Since', 'request', 'RFC 2616, 14.25') |
|---|
| 889 | |
|---|
| 890 | class _Range(_MultiValueHeader): |
|---|
| 891 | """ |
|---|
| 892 | Range, RFC 2616 14.35 (use ``RANGE``) |
|---|
| 893 | |
|---|
| 894 | According to section 14.16, the response to this message should be a |
|---|
| 895 | 206 Partial Content and that if multiple non-overlapping byte ranges |
|---|
| 896 | are requested (it is an error to request multiple overlapping |
|---|
| 897 | ranges) the result should be sent as multipart/byteranges mimetype. |
|---|
| 898 | |
|---|
| 899 | The server should respond with '416 Requested Range Not Satisfiable' |
|---|
| 900 | if the requested ranges are out-of-bounds. The specification also |
|---|
| 901 | indicates that a syntax error in the Range request should result in |
|---|
| 902 | the header being ignored rather than a '400 Bad Request'. |
|---|
| 903 | """ |
|---|
| 904 | |
|---|
| 905 | def parse(self, *args, **kwargs): |
|---|
| 906 | """ |
|---|
| 907 | Returns a tuple (units, list), where list is a sequence of |
|---|
| 908 | (begin, end) tuples; and end is None if it was not provided. |
|---|
| 909 | """ |
|---|
| 910 | value = self.__call__(*args, **kwargs) |
|---|
| 911 | if not value: |
|---|
| 912 | return None |
|---|
| 913 | ranges = [] |
|---|
| 914 | last_end = -1 |
|---|
| 915 | try: |
|---|
| 916 | (units, range) = value.split("=", 1) |
|---|
| 917 | units = units.strip().lower() |
|---|
| 918 | for item in range.split(","): |
|---|
| 919 | (begin, end) = item.split("-") |
|---|
| 920 | if not begin.strip(): |
|---|
| 921 | begin = 0 |
|---|
| 922 | else: |
|---|
| 923 | begin = int(begin) |
|---|
| 924 | if begin <= last_end: |
|---|
| 925 | raise ValueError() |
|---|
| 926 | if not end.strip(): |
|---|
| 927 | end = None |
|---|
| 928 | else: |
|---|
| 929 | end = int(end) |
|---|
| 930 | last_end = end |
|---|
| 931 | ranges.append((begin, end)) |
|---|
| 932 | except ValueError: |
|---|
| 933 | # In this case where the Range header is malformed, |
|---|
| 934 | # section 14.16 says to treat the request as if the |
|---|
| 935 | # Range header was not present. How do I log this? |
|---|
| 936 | return None |
|---|
| 937 | return (units, ranges) |
|---|
| 938 | _Range('Range', 'request', 'RFC 2616, 14.35') |
|---|
| 939 | |
|---|
| 940 | class _AcceptLanguage(_MultiValueHeader): |
|---|
| 941 | """ |
|---|
| 942 | Accept-Language, RFC 2616 section 14.4 |
|---|
| 943 | """ |
|---|
| 944 | |
|---|
| 945 | def parse(self, *args, **kwargs): |
|---|
| 946 | """ |
|---|
| 947 | Return a list of language tags sorted by their "q" values. For example, |
|---|
| 948 | "en-us,en;q=0.5" should return ``["en-us", "en"]``. If there is no |
|---|
| 949 | ``Accept-Language`` header present, default to ``[]``. |
|---|
| 950 | """ |
|---|
| 951 | header = self.__call__(*args, **kwargs) |
|---|
| 952 | if header is None: |
|---|
| 953 | return [] |
|---|
| 954 | langs = [v for v in header.split(",") if v] |
|---|
| 955 | qs = [] |
|---|
| 956 | for lang in langs: |
|---|
| 957 | pieces = lang.split(";") |
|---|
| 958 | lang, params = pieces[0].strip().lower(), pieces[1:] |
|---|
| 959 | q = 1 |
|---|
| 960 | for param in params: |
|---|
| 961 | if '=' not in param: |
|---|
| 962 | # Malformed request; probably a bot, we'll ignore |
|---|
| 963 | continue |
|---|
| 964 | lvalue, rvalue = param.split("=") |
|---|
| 965 | lvalue = lvalue.strip().lower() |
|---|
| 966 | rvalue = rvalue.strip() |
|---|
| 967 | if lvalue == "q": |
|---|
| 968 | q = float(rvalue) |
|---|
| 969 | qs.append((lang, q)) |
|---|
| 970 | qs.sort(lambda a, b: -cmp(a[1], b[1])) |
|---|
| 971 | return [lang for (lang, q) in qs] |
|---|
| 972 | _AcceptLanguage('Accept-Language', 'request', 'RFC 2616, 14.4') |
|---|
| 973 | |
|---|
| 974 | class _AcceptRanges(_MultiValueHeader): |
|---|
| 975 | """ |
|---|
| 976 | Accept-Ranges, RFC 2616 section 14.5 |
|---|
| 977 | """ |
|---|
| 978 | def compose(self, none=None, bytes=None): |
|---|
| 979 | if bytes: |
|---|
| 980 | return ('bytes',) |
|---|
| 981 | return ('none',) |
|---|
| 982 | _AcceptRanges('Accept-Ranges', 'response', 'RFC 2616, 14.5') |
|---|
| 983 | |
|---|
| 984 | class _ContentRange(_SingleValueHeader): |
|---|
| 985 | """ |
|---|
| 986 | Content-Range, RFC 2616 section 14.6 |
|---|
| 987 | """ |
|---|
| 988 | def compose(self, first_byte=None, last_byte=None, total_length=None): |
|---|
| 989 | retval = "bytes %d-%d/%d" % (first_byte, last_byte, total_length) |
|---|
| 990 | assert last_byte == -1 or first_byte <= last_byte |
|---|
| 991 | assert last_byte < total_length |
|---|
| 992 | return (retval,) |
|---|
| 993 | _ContentRange('Content-Range', 'entity', 'RFC 2616, 14.6') |
|---|
| 994 | |
|---|
| 995 | class _Authorization(_SingleValueHeader): |
|---|
| 996 | """ |
|---|
| 997 | Authorization, RFC 2617 (RFC 2616, 14.8) |
|---|
| 998 | """ |
|---|
| 999 | def compose(self, digest=None, basic=None, username=None, password=None, |
|---|
| 1000 | challenge=None, path=None, method=None): |
|---|
| 1001 | assert username and password |
|---|
| 1002 | if basic or not challenge: |
|---|
| 1003 | assert not digest |
|---|
| 1004 | userpass = "%s:%s" % (username.strip(), password.strip()) |
|---|
| 1005 | return "Basic %s" % userpass.encode('base64').strip() |
|---|
| 1006 | assert challenge and not basic |
|---|
| 1007 | path = path or "/" |
|---|
| 1008 | (_, realm) = challenge.split('realm="') |
|---|
| 1009 | (realm, _) = realm.split('"', 1) |
|---|
| 1010 | auth = urllib2.AbstractDigestAuthHandler() |
|---|
| 1011 | auth.add_password(realm, path, username, password) |
|---|
| 1012 | (token, challenge) = challenge.split(' ', 1) |
|---|
| 1013 | chal = urllib2.parse_keqv_list(urllib2.parse_http_list(challenge)) |
|---|
| 1014 | class FakeRequest(object): |
|---|
| 1015 | def get_full_url(self): |
|---|
| 1016 | return path |
|---|
| 1017 | def has_data(self): |
|---|
| 1018 | return False |
|---|
| 1019 | def get_method(self): |
|---|
| 1020 | return method or "GET" |
|---|
| 1021 | get_selector = get_full_url |
|---|
| 1022 | retval = "Digest %s" % auth.get_authorization(FakeRequest(), chal) |
|---|
| 1023 | return (retval,) |
|---|
| 1024 | _Authorization('Authorization', 'request', 'RFC 2617') |
|---|
| 1025 | |
|---|
| 1026 | # |
|---|
| 1027 | # For now, construct a minimalistic version of the field-names; at a |
|---|
| 1028 | # later date more complicated headers may sprout content constructors. |
|---|
| 1029 | # The items commented out have concrete variants. |
|---|
| 1030 | # |
|---|
| 1031 | for (name, category, version, style, comment) in \ |
|---|
| 1032 | (("Accept" ,'request' ,'1.1','multi-value','RFC 2616, 14.1' ) |
|---|
| 1033 | ,("Accept-Charset" ,'request' ,'1.1','multi-value','RFC 2616, 14.2' ) |
|---|
| 1034 | ,("Accept-Encoding" ,'request' ,'1.1','multi-value','RFC 2616, 14.3' ) |
|---|
| 1035 | #,("Accept-Language" ,'request' ,'1.1','multi-value','RFC 2616, 14.4' ) |
|---|
| 1036 | #,("Accept-Ranges" ,'response','1.1','multi-value','RFC 2616, 14.5' ) |
|---|
| 1037 | ,("Age" ,'response','1.1','singular' ,'RFC 2616, 14.6' ) |
|---|
| 1038 | ,("Allow" ,'entity' ,'1.0','multi-value','RFC 2616, 14.7' ) |
|---|
| 1039 | #,("Authorization" ,'request' ,'1.0','singular' ,'RFC 2616, 14.8' ) |
|---|
| 1040 | #,("Cache-Control" ,'general' ,'1.1','multi-value','RFC 2616, 14.9' ) |
|---|
| 1041 | ,("Cookie" ,'request' ,'1.0','multi-value','RFC 2109/Netscape') |
|---|
| 1042 | ,("Connection" ,'general' ,'1.1','multi-value','RFC 2616, 14.10') |
|---|
| 1043 | ,("Content-Encoding" ,'entity' ,'1.0','multi-value','RFC 2616, 14.11') |
|---|
| 1044 | #,("Content-Disposition",'entity' ,'1.1','multi-value','RFC 2616, 15.5' ) |
|---|
| 1045 | ,("Content-Language" ,'entity' ,'1.1','multi-value','RFC 2616, 14.12') |
|---|
| 1046 | #,("Content-Length" ,'entity' ,'1.0','singular' ,'RFC 2616, 14.13') |
|---|
| 1047 | ,("Content-Location" ,'entity' ,'1.1','singular' ,'RFC 2616, 14.14') |
|---|
| 1048 | ,("Content-MD5" ,'entity' ,'1.1','singular' ,'RFC 2616, 14.15') |
|---|
| 1049 | #,("Content-Range" ,'entity' ,'1.1','singular' ,'RFC 2616, 14.16') |
|---|
| 1050 | #,("Content-Type" ,'entity' ,'1.0','singular' ,'RFC 2616, 14.17') |
|---|
| 1051 | ,("Date" ,'general' ,'1.0','date-header','RFC 2616, 14.18') |
|---|
| 1052 | ,("ETag" ,'response','1.1','singular' ,'RFC 2616, 14.19') |
|---|
| 1053 | ,("Expect" ,'request' ,'1.1','multi-value','RFC 2616, 14.20') |
|---|
| 1054 | ,("Expires" ,'entity' ,'1.0','date-header','RFC 2616, 14.21') |
|---|
| 1055 | ,("From" ,'request' ,'1.0','singular' ,'RFC 2616, 14.22') |
|---|
| 1056 | ,("Host" ,'request' ,'1.1','singular' ,'RFC 2616, 14.23') |
|---|
| 1057 | ,("If-Match" ,'request' ,'1.1','multi-value','RFC 2616, 14.24') |
|---|
| 1058 | #,("If-Modified-Since" ,'request' ,'1.0','date-header','RFC 2616, 14.25') |
|---|
| 1059 | ,("If-None-Match" ,'request' ,'1.1','multi-value','RFC 2616, 14.26') |
|---|
| 1060 | ,("If-Range" ,'request' ,'1.1','singular' ,'RFC 2616, 14.27') |
|---|
| 1061 | ,("If-Unmodified-Since",'request' ,'1.1','date-header' ,'RFC 2616, 14.28') |
|---|
| 1062 | ,("Last-Modified" ,'entity' ,'1.0','date-header','RFC 2616, 14.29') |
|---|
| 1063 | ,("Location" ,'response','1.0','singular' ,'RFC 2616, 14.30') |
|---|
| 1064 | ,("Max-Forwards" ,'request' ,'1.1','singular' ,'RFC 2616, 14.31') |
|---|
| 1065 | ,("Pragma" ,'general' ,'1.0','multi-value','RFC 2616, 14.32') |
|---|
| 1066 | ,("Proxy-Authenticate" ,'response','1.1','multi-value','RFC 2616, 14.33') |
|---|
| 1067 | ,("Proxy-Authorization",'request' ,'1.1','singular' ,'RFC 2616, 14.34') |
|---|
| 1068 | #,("Range" ,'request' ,'1.1','multi-value','RFC 2616, 14.35') |
|---|
| 1069 | ,("Referer" ,'request' ,'1.0','singular' ,'RFC 2616, 14.36') |
|---|
| 1070 | ,("Retry-After" ,'response','1.1','singular' ,'RFC 2616, 14.37') |
|---|
| 1071 | ,("Server" ,'response','1.0','singular' ,'RFC 2616, 14.38') |
|---|
| 1072 | ,("Set-Cookie" ,'response','1.0','multi-entry','RFC 2109/Netscape') |
|---|
| 1073 | ,("TE" ,'request' ,'1.1','multi-value','RFC 2616, 14.39') |
|---|
| 1074 | ,("Trailer" ,'general' ,'1.1','multi-value','RFC 2616, 14.40') |
|---|
| 1075 | ,("Transfer-Encoding" ,'general' ,'1.1','multi-value','RFC 2616, 14.41') |
|---|
| 1076 | ,("Upgrade" ,'general' ,'1.1','multi-value','RFC 2616, 14.42') |
|---|
| 1077 | ,("User-Agent" ,'request' ,'1.0','singular' ,'RFC 2616, 14.43') |
|---|
| 1078 | ,("Vary" ,'response','1.1','multi-value','RFC 2616, 14.44') |
|---|
| 1079 | ,("Via" ,'general' ,'1.1','multi-value','RFC 2616, 14.45') |
|---|
| 1080 | ,("Warning" ,'general' ,'1.1','multi-entry','RFC 2616, 14.46') |
|---|
| 1081 | ,("WWW-Authenticate" ,'response','1.0','multi-entry','RFC 2616, 14.47')): |
|---|
| 1082 | klass = {'multi-value': _MultiValueHeader, |
|---|
| 1083 | 'multi-entry': _MultiEntryHeader, |
|---|
| 1084 | 'date-header': _DateHeader, |
|---|
| 1085 | 'singular' : _SingleValueHeader}[style] |
|---|
| 1086 | klass(name, category, comment, version).__doc__ = comment |
|---|
| 1087 | del klass |
|---|
| 1088 | |
|---|
| 1089 | for head in _headers.values(): |
|---|
| 1090 | headname = head.name.replace("-","_").upper() |
|---|
| 1091 | locals()[headname] = head |
|---|
| 1092 | __all__.append(headname) |
|---|
| 1093 | |
|---|
| 1094 | __pudge_all__ = __all__[:] |
|---|
| 1095 | for _name, _obj in globals().items(): |
|---|
| 1096 | if isinstance(_obj, type) and issubclass(_obj, HTTPHeader): |
|---|
| 1097 | __pudge_all__.append(_name) |
|---|