Soup.URI

Fields

Name      Type  Access  Description
fragment  str   r/w     a fragment identifier within path, or None
host      str   r/w     the hostname or IP address, or None
password  str   r/w     a password, or None
path      str   r/w     the path on host
port      int   r/w     the port number on host
query     str   r/w     a query for path, or None
scheme    str   r/w     the URI scheme (e.g., “http”)
user      str   r/w     a username, or None

Methods

class  decode (part)
class  encode (part, escape_extra)
class  new (uri_string)
class  new_with_base (base, uri_string)
class  normalize (part, unescape_extra)
       copy ()
       copy_host ()
       equal (uri2)
       free ()
       get_fragment ()
       get_host ()
       get_password ()
       get_path ()
       get_port ()
       get_query ()
       get_scheme ()
       get_user ()
       host_equal (v2)
       host_hash ()
       set_fragment (fragment)
       set_host (host)
       set_password (password)
       set_path (path)
       set_port (port)
       set_query (query)
       set_query_from_form (form)
       set_scheme (scheme)
       set_user (user)
       to_string (just_path_and_query)
       uses_default_port ()

Details

class Soup.URI

A Soup.URI represents a (parsed) URI. Soup.URI supports RFC 3986 (URI Generic Syntax), and can parse any valid URI. However, libsoup only uses “http” and “https” URIs internally. You can use SOUP_URI_VALID_FOR_HTTP() to test if a Soup.URI is a valid HTTP URI.

scheme will always be set in any URI. It is an interned string and is always all lowercase. (If you parse a URI with a non-lowercase scheme, it will be converted to lowercase.) The macros SOUP_URI_SCHEME_HTTP and SOUP_URI_SCHEME_HTTPS provide the interned values for “http” and “https” and can be compared against URI scheme values.

user and password are parsed as defined in the older URI specs (i.e., separated by a colon; RFC 3986 only talks about a single “userinfo” field). Note that password is not included in the output of Soup.URI.to_string(). libsoup does not normally use these fields; authentication is handled via Soup.Session signals.

host contains the hostname, and port the port specified in the URI. If the URI doesn’t contain a hostname, host will be None, and if it doesn’t specify a port, port may be 0. However, for “http” and “https” URIs, host is guaranteed to be non-None (trying to parse an http URI with no host will return None), and port will always be non-0 (because libsoup knows the default value to use when it is not specified in the URI).

path is always non-None. For http/https URIs, path will never be an empty string either; if the input URI has no path, the parsed Soup.URI will have a path of “/”.

query and fragment are optional for all URI types. Soup.form_decode() may be useful for parsing query.

Note that path, query, and fragment may contain %-encoded characters. Soup.URI.new() calls Soup.URI.normalize() on them, but not Soup.URI.decode(). This is necessary to ensure that Soup.URI.to_string() will generate a URI that has exactly the same meaning as the original. (In theory, Soup.URI should leave user, password, and host partially-encoded as well, but this would be more annoying than useful.)
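The field rules above can be checked with a short sketch. It assumes the libsoup 2.4 introspection data is installed; load_soup() and describe() are hypothetical helper names, and the guarded import only exists so the snippet still loads where the bindings are missing.

```python
def load_soup():
    """Return the libsoup 2.4 bindings, or None when they are unavailable."""
    try:
        import gi
        gi.require_version("Soup", "2.4")
        from gi.repository import Soup
        return Soup
    except (ImportError, ValueError):
        return None


def describe(uri_string):
    """Parse uri_string and return its components as a dict (None on failure)."""
    Soup = load_soup()
    if Soup is None:
        return None
    uri = Soup.URI.new(uri_string)
    if uri is None:  # the string was not accepted as a URI
        return None
    return {
        "scheme": uri.get_scheme(),      # always lowercase
        "host": uri.get_host(),          # non-None for http/https
        "port": uri.get_port(),          # default filled in for http/https
        "path": uri.get_path(),          # never None
        "query": uri.get_query(),        # may be None
        "fragment": uri.get_fragment(),  # may be None
    }


if __name__ == "__main__":
    print(describe("HTTP://example.com/search?q=soup#top"))
```

The input deliberately uses an uppercase scheme and omits the port, so the returned dict should show the lowercase conversion and the default port described above.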

classmethod decode(part)
Parameters:

part (str) – a URI part

Returns:

the decoded URI part.

Return type:

str

Fully %-decodes part.

In the past, this would return None if part contained invalid percent-encoding, but now it just ignores the problem (as Soup.URI.new() already did).

classmethod encode(part, escape_extra)
Parameters:
  • part (str) – a URI part

  • escape_extra (str or None) – additional reserved characters to escape (or None)

Returns:

the encoded URI part

Return type:

str

%-encodes the given URI part and returns the escaped version as a new string.
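As a rough illustration of how encode() and decode() relate, the sketch below encodes a string and decodes it again. It assumes the libsoup 2.4 bindings are installed; load_soup() and roundtrip() are hypothetical helper names, not part of libsoup.

```python
def load_soup():
    """Return the libsoup 2.4 bindings, or None when they are unavailable."""
    try:
        import gi
        gi.require_version("Soup", "2.4")
        from gi.repository import Soup
        return Soup
    except (ImportError, ValueError):
        return None


def roundtrip(text):
    """Return (encoded, decoded) for text, or None without the bindings."""
    Soup = load_soup()
    if Soup is None:
        return None
    # Passing None for escape_extra requests no additional escaping.
    encoded = Soup.URI.encode(text, None)
    decoded = Soup.URI.decode(encoded)
    return encoded, decoded


if __name__ == "__main__":
    print(roundtrip("a b/c"))
```

The decoded value should match the original input, since decode() fully reverses the escaping.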

classmethod new(uri_string)
Parameters:

uri_string (str or None) – a URI

Returns:

a Soup.URI, or None if the given string was found to be invalid.

Return type:

Soup.URI or None

Parses an absolute URI.

You can also pass None for uri_string if you want to get back an “empty” Soup.URI that you can fill in by hand. (You will need to call at least Soup.URI.set_scheme() and Soup.URI.set_path(), since those fields are required.)
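A minimal sketch of filling in an “empty” Soup.URI by hand, assuming the libsoup 2.4 bindings are installed. load_soup() and build_by_hand() are hypothetical helper names, and example.com is a placeholder host.

```python
def load_soup():
    """Return the libsoup 2.4 bindings, or None when they are unavailable."""
    try:
        import gi
        gi.require_version("Soup", "2.4")
        from gi.repository import Soup
        return Soup
    except (ImportError, ValueError):
        return None


def build_by_hand():
    """Build a URI from its parts and return it as a string."""
    Soup = load_soup()
    if Soup is None:
        return None
    uri = Soup.URI.new(None)     # start from an empty URI
    uri.set_scheme("https")      # required; also sets the default port
    uri.set_host("example.com")  # http/https URIs should have a host
    uri.set_path("/index.html")  # required
    return uri.to_string(False)


if __name__ == "__main__":
    print(build_by_hand())
```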

classmethod new_with_base(base, uri_string)
Parameters:
  • base (Soup.URI) – a base URI

  • uri_string (str) – the URI

Returns:

a parsed Soup.URI.

Return type:

Soup.URI

Parses uri_string relative to base.
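The following sketch resolves a relative reference against a base URI. It assumes the libsoup 2.4 bindings are installed; load_soup() and resolve() are hypothetical helper names, and the URLs are placeholders.

```python
def load_soup():
    """Return the libsoup 2.4 bindings, or None when they are unavailable."""
    try:
        import gi
        gi.require_version("Soup", "2.4")
        from gi.repository import Soup
        return Soup
    except (ImportError, ValueError):
        return None


def resolve(base_string, reference):
    """Resolve reference against base_string and return the result as a string."""
    Soup = load_soup()
    if Soup is None:
        return None
    base = Soup.URI.new(base_string)
    if base is None:  # the base itself did not parse
        return None
    resolved = Soup.URI.new_with_base(base, reference)
    if resolved is None:
        return None
    return resolved.to_string(False)


if __name__ == "__main__":
    print(resolve("http://example.com/a/b/index.html", "../c?x=1"))
```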

classmethod normalize(part, unescape_extra)
Parameters:
  • part (str) – a URI part

  • unescape_extra (str or None) – reserved characters to unescape (or None)

Returns:

the normalized URI part

Return type:

str

%-decodes any “unreserved” characters (or characters in unescape_extra) in part, and %-encodes any non-ASCII characters, spaces, and non-printing characters in part.

“Unreserved” characters are those that are not allowed to be used for punctuation according to the URI spec. For example, letters are unreserved, so Soup.URI.normalize() will turn http://example.com/foo/b%61r into http://example.com/foo/bar, which is guaranteed to mean the same thing. However, “/” is “reserved”, so http://example.com/foo%2Fbar would not be changed, because it might mean something different to the server.

In the past, this would return None if part contained invalid percent-encoding, but now it just ignores the problem (as Soup.URI.new() already did).
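To make the rules concrete, here is a pure-Python sketch of the behavior described above. It is not libsoup's implementation: normalize_sketch() is a hypothetical name, the unreserved set is taken from RFC 3986 (letters, digits, “-”, “.”, “_”, “~”), and edge cases may differ from the real function.

```python
import re
import string

# Unreserved characters per RFC 3986 (assumed to match libsoup's notion).
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_sketch(part, unescape_extra=None):
    """Approximate the documented normalize() rules for a URI part."""
    extra = set(unescape_extra or "")

    def unescape(match):
        ch = chr(int(match.group(1), 16))
        # Decode only unreserved characters and explicitly requested extras;
        # every other escape is kept exactly as written.
        if ch in UNRESERVED or ch in extra:
            return ch
        return match.group(0)

    # Malformed escapes such as "%zz" do not match and are left alone.
    decoded = _ESCAPE.sub(unescape, part)

    out = []
    for ch in decoded:
        code = ord(ch)
        if code <= 0x20 or code >= 0x7F:
            # Spaces, control characters, and non-ASCII: escape UTF-8 bytes.
            out.extend("%%%02X" % b for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(normalize_sketch("/foo/b%61r"))   # unreserved escape is decoded
    print(normalize_sketch("/foo%2Fbar"))   # reserved escape is kept
```

Passing “/” as unescape_extra would additionally decode %2F, which is why that parameter should only name characters known to be safe in the part being normalized.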

copy()
Returns:

a copy of self, which must be freed with Soup.URI.free()

Return type:

Soup.URI

Copies self.

copy_host()
Returns:

the new Soup.URI

Return type:

Soup.URI

Makes a copy of self, considering only the scheme, host, and port.

New in version 2.28.

equal(uri2)
Parameters:

uri2 (Soup.URI) – another Soup.URI

Returns:

True or False

Return type:

bool

Tests whether or not self and uri2 are equal in all parts.

free()

Frees self.

get_fragment()
Returns:

self's fragment.

Return type:

str

Gets self's fragment.

New in version 2.32.

get_host()
Returns:

self's host.

Return type:

str

Gets self's host.

New in version 2.32.

get_password()
Returns:

self's password.

Return type:

str

Gets self's password.

New in version 2.32.

get_path()
Returns:

self's path.

Return type:

str

Gets self's path.

New in version 2.32.

get_port()
Returns:

self's port.

Return type:

int

Gets self's port.

New in version 2.32.

get_query()
Returns:

self's query.

Return type:

str

Gets self's query.

New in version 2.32.

get_scheme()
Returns:

self's scheme.

Return type:

str

Gets self's scheme.

New in version 2.32.

get_user()
Returns:

self's user.

Return type:

str

Gets self's user.

New in version 2.32.

host_equal(v2)
Parameters:

v2 (Soup.URI) – a Soup.URI with a non-None host member

Returns:

whether or not the URIs are equal in scheme, host, and port.

Return type:

bool

Compares self and v2, considering only the scheme, host, and port.

New in version 2.28.

host_hash()
Returns:

a hash

Return type:

int

Hashes self, considering only the scheme, host, and port.

New in version 2.28.

set_fragment(fragment)
Parameters:

fragment (str or None) – the fragment

Sets self's fragment to fragment.

set_host(host)
Parameters:

host (str or None) – the hostname or IP address, or None

Sets self's host to host.

If host is an IPv6 IP address, it should not include the brackets required by the URI syntax; they will be added automatically when converting self to a string.

http and https URIs should not have a None host.
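The IPv6 bracket handling can be seen in a small sketch, assuming the libsoup 2.4 bindings are installed. load_soup() and with_ipv6_host() are hypothetical helper names.

```python
def load_soup():
    """Return the libsoup 2.4 bindings, or None when they are unavailable."""
    try:
        import gi
        gi.require_version("Soup", "2.4")
        from gi.repository import Soup
        return Soup
    except (ImportError, ValueError):
        return None


def with_ipv6_host(address):
    """Set an IPv6 literal as the host and return the URI as a string."""
    Soup = load_soup()
    if Soup is None:
        return None
    uri = Soup.URI.new("http://example.com/")
    uri.set_host(address)  # pass the bare address, without brackets
    return uri.to_string(False)


if __name__ == "__main__":
    print(with_ipv6_host("::1"))
```

The brackets should appear only in the string form; get_host() is expected to keep returning the bare address.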

set_password(password)
Parameters:

password (str or None) – the password, or None

Sets self's password to password.

set_path(path)
Parameters:

path (str) – the non-None path

Sets self's path to path.

set_port(port)
Parameters:

port (int) – the port, or 0

Sets self's port to port. If port is 0, self will not have an explicitly-specified port.

set_query(query)
Parameters:

query (str or None) – the query

Sets self's query to query.

set_query_from_form(form)
Parameters:

form ({str: str}) – a GLib.HashTable containing HTML form information

Sets self's query to the result of encoding form according to the HTML form rules. See Soup.form_encode_hash() for more information.
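A short sketch of setting the query from a Python dict, assuming the libsoup 2.4 bindings are installed and that PyGObject converts the dict to a GLib.HashTable as the parameter type suggests. load_soup() and query_from_form() are hypothetical helper names.

```python
def load_soup():
    """Return the libsoup 2.4 bindings, or None when they are unavailable."""
    try:
        import gi
        gi.require_version("Soup", "2.4")
        from gi.repository import Soup
        return Soup
    except (ImportError, ValueError):
        return None


def query_from_form(fields):
    """Encode fields as an HTML form and return the resulting query string."""
    Soup = load_soup()
    if Soup is None:
        return None
    uri = Soup.URI.new("http://example.com/search")
    uri.set_query_from_form(fields)  # fields maps str keys to str values
    return uri.get_query()


if __name__ == "__main__":
    print(query_from_form({"q": "libsoup uri"}))
```

Because the form is passed as a hash table, the order of multiple fields in the resulting query should not be relied on.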

set_scheme(scheme)
Parameters:

scheme (str) – the URI scheme

Sets self's scheme to scheme. This will also set self's port to the default port for scheme, if known.

set_user(user)
Parameters:

user (str or None) – the username, or None

Sets self's user to user.

to_string(just_path_and_query)
Parameters:

just_path_and_query (bool) – if True, output just the path and query portions

Returns:

a string representing self.

Return type:

str

Returns a string representing self.

If just_path_and_query is True, this concatenates the path and query together. That is, it constructs the string that would be needed in the Request-Line of an HTTP request for self.

Note that the output will never contain a password, even if self does.
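The two output modes can be compared with a sketch like the following, assuming the libsoup 2.4 bindings are installed. load_soup() and both_forms() are hypothetical helper names, and the credentials are placeholders.

```python
def load_soup():
    """Return the libsoup 2.4 bindings, or None when they are unavailable."""
    try:
        import gi
        gi.require_version("Soup", "2.4")
        from gi.repository import Soup
        return Soup
    except (ImportError, ValueError):
        return None


def both_forms(uri_string):
    """Return (full string, path-and-query string) for uri_string."""
    Soup = load_soup()
    if Soup is None:
        return None
    uri = Soup.URI.new(uri_string)
    if uri is None:
        return None
    full = uri.to_string(False)         # whole URI, minus any password
    request_target = uri.to_string(True)  # just the path and query
    return full, request_target


if __name__ == "__main__":
    print(both_forms("http://user:secret@example.com/a?b=1#frag"))
```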

uses_default_port()
Returns:

True or False

Return type:

bool

Tests if self uses the default port for its scheme (e.g., 80 for http). This only works for http, https, and ftp; libsoup does not know the default ports of other protocols.