4.1. py4j.java_gateway — Py4J Main API

The py4j.java_gateway module defines most of the classes needed to use Py4J. Py4J users are expected to use only JavaGateway explicitly and, optionally, GatewayParameters, CallbackServerParameters, java_import, get_field, get_method, launch_gateway, and is_instance_of. The other module members are documented to support the extension of Py4J.

4.1.1. JavaGateway

class py4j.java_gateway.JavaGateway(gateway_client=None, auto_field=False, python_proxy_port=25334, start_callback_server=False, auto_convert=False, eager_load=False, gateway_parameters=None, callback_server_parameters=None, python_server_entry_point=None, java_process=None)
A JavaGateway is the main interaction point between a Python VM and a JVM.

  • A JavaGateway instance is connected to a Gateway instance on the Java side.

  • The entry_point field of a JavaGateway instance is connected to the Gateway.entryPoint instance on the Java side.

  • The java_gateway_server field of a JavaGateway instance is connected to the GatewayServer instance on the Java side.

  • The jvm field of JavaGateway enables users to access classes and static members (fields and methods) and to call constructors.

  • The java_process field of a JavaGateway instance is a subprocess.Popen object for the Java process that the JavaGateway is connected to, or None if the JavaGateway connected to a preexisting Java process (in which case we cannot directly access that process from Python).

Methods that are not defined by JavaGateway are always redirected to entry_point. For example, gateway.doThat() is equivalent to gateway.entry_point.doThat(). This is a trade-off between convenience and potential confusion.

Parameters:
  • gateway_parameters – An instance of GatewayParameters used to configure the various options of the gateway.

  • callback_server_parameters – An instance of CallbackServerParameters used to configure various options of the callback server. Must be provided to start a callback server. Otherwise, callbacks won’t be available.

  • python_server_entry_point – can be requested by the Java side if Java is driving the communication.

  • java_process – the subprocess.Popen object for the Java process that the JavaGateway shall connect to, if available.
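The redirection of undefined methods to entry_point can be pictured with a small pure-Python stand-in. This is only a sketch of the delegation idea, not Py4J's actual implementation; the classes FakeEntryPoint and FakeGateway and the return values are hypothetical.

```python
class FakeEntryPoint:
    """Stands in for the Java-side entry point object (hypothetical)."""

    def doThat(self):
        return "done"


class FakeGateway:
    """Stands in for JavaGateway: unknown attributes go to entry_point."""

    def __init__(self, entry_point):
        self.entry_point = entry_point

    def close(self):
        # Defined on the gateway itself, so it is never redirected.
        return "gateway closed"

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        return getattr(self.entry_point, name)


gateway = FakeGateway(FakeEntryPoint())
same_result = gateway.doThat() == gateway.entry_point.doThat()
```

The sketch also hints at the potential confusion: if the entry point defined its own close() method, gateway.close() would still call the gateway's method, so gateway.entry_point.close() would have to be spelled out.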

close(keep_callback_server=False, close_callback_server_connections=False)
Closes all gateway connections. A connection will be reopened if necessary (e.g., if a JavaMethod is called).

Parameters:
  • keep_callback_server – if True, the callback server is not shut down. Mutually exclusive with close_callback_server_connections.

  • close_callback_server_connections – if True, close all callback server connections.

close_callback_server(raise_exception=False)
Closes the CallbackServer connections.

Parameters:

raise_exception – If True, raise an exception if an error occurs while closing the callback server connections (very likely with sockets).

detach(java_object)

Makes the Java Gateway dereference this object.

The equivalent of this method is called when a JavaObject instance is garbage collected on the Python side. This method, or gc.collect(), should still be invoked when memory is limited or when too many objects are created on the Java side.

Parameters:

java_object – The JavaObject instance to dereference (free) on the Java side.

get_callback_server()
help(var, pattern=None, short_name=True, display=True)

Displays a help page about a class or an object.

Parameters:
  • var – JavaObject, JavaClass or JavaMember for which a help page will be generated.

  • pattern – Star-pattern used to filter the members. For example “get*Foo” may return getMyFoo, getFoo, getFooBar, but not bargetFoo. The pattern is matched against the entire signature. To match only the name of a method, use “methodName(*”.

  • short_name – If True, only the simple name of the parameter types and return types will be displayed. If False, the fully qualified name of the types will be displayed.

  • display – If True, the help page is displayed in an interactive page similar to the help command in Python. If False, the page is returned as a string.
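The matching code itself is not shown here. A sketch that is consistent with the examples given for the pattern parameter treats * as "any run of characters" and anchors the match at the start of the signature only. The helper names and the sample signatures below are hypothetical, and the real implementation may differ in detail.

```python
import re


def star_pattern_to_regex(pattern):
    """Turn a star-pattern such as "get*Foo" into a compiled regex."""
    # Escape everything, then turn each escaped "*" back into ".*".
    return re.compile(re.escape(pattern).replace(r"\*", ".*"))


def filter_signatures(signatures, pattern):
    """Keep the signatures whose beginning matches the star-pattern."""
    regex = star_pattern_to_regex(pattern)
    # re.match anchors at the start only, so "getFooBar(int)" still matches
    # "get*Foo" while "bargetFoo()" does not.
    return [s for s in signatures if regex.match(s)]


signatures = ["getMyFoo()", "getFoo()", "getFooBar(int)", "bargetFoo()"]
matching = filter_signatures(signatures, "get*Foo")
```

Under the same assumption, a pattern such as "size(*" keeps "size()" but drops "sizeOf(Object)", which is why the opening parenthesis is recommended for matching a method name exactly.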

classmethod launch_gateway(port=0, jarpath='', classpath='', javaopts=[], die_on_exit=False, redirect_stdout=None, redirect_stderr=None, daemonize_redirect=True, java_path='java', create_new_process_group=False, enable_auth=False, cwd=None, use_shell=False)

Launch a Gateway in a new Java process and create a default JavaGateway to connect to it.

See launch_gateway for more information about this function.

Parameters:
  • port – the port to launch the Java Gateway on. If no port is specified then an ephemeral port is used.

  • jarpath – the path to the Py4J jar. Only necessary if the jar was installed at a non-standard location or if Python is using a different sys.prefix than the one that Py4J was installed under.

  • classpath – the classpath used to launch the Java Gateway.

  • javaopts – an array of extra options to pass to Java (the classpath should be specified using the classpath parameter, not javaopts.)

  • die_on_exit – if True, the Java gateway process will die when this Python process exits or is killed.

  • redirect_stdout – where to redirect the JVM stdout. If None (default) stdout is redirected to os.devnull. Otherwise accepts a file descriptor, a queue, or a deque. Will send one line at a time to these objects.

  • redirect_stderr – where to redirect the JVM stderr. If None (default) stderr is redirected to os.devnull. Otherwise accepts a file descriptor, a queue, or a deque. Will send one line at a time to these objects.

  • daemonize_redirect – if True, the consumer threads will be daemonized and will not prevent the main Python process from exiting. This means the file descriptors (stderr, stdout, redirect_stderr, redirect_stdout) might not be properly closed. This is not usually a problem, but in case of errors related to file descriptors, set this flag to False.

  • java_path – If None, Py4J will use $JAVA_HOME/bin/java if $JAVA_HOME is defined, otherwise it will use “java”.

  • create_new_process_group – If True, the JVM is started in a new process group. This ensures that signals sent to the parent Python process are not forwarded to the JVM. For example, sending Ctrl-C/SIGINT won’t interrupt the JVM. Note that if the Python process dies, the Java process will stay alive, which may be a problem in some scenarios.

  • enable_auth – If True, the server will require clients to provide an authentication token when connecting.

  • cwd – If not None, path that will be used as the current working directory of the Java process.

  • use_shell – If True, Popen will start the Java process with shell=True.

Return type:

a JavaGateway connected to the Gateway server.
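As a minimal sketch of the class method in use, the helper below launches a JVM and returns the connected gateway. The helper name start_jvm is hypothetical, and the sketch assumes that the Py4J jar can be found at its standard location and that a java executable is available.

```python
def start_jvm(classpath=""):
    """Launch a JVM running a Py4J gateway and return a connected JavaGateway."""
    # Imported here so that defining the helper does not itself require Py4J.
    from py4j.java_gateway import JavaGateway

    # port is left at its default of 0, so an ephemeral port is used;
    # die_on_exit ties the lifetime of the JVM to this Python process.
    return JavaGateway.launch_gateway(classpath=classpath, die_on_exit=True)
```

A caller would typically use the returned object like any other JavaGateway, for example gateway.jvm.java.lang.System.currentTimeMillis(), and call gateway.shutdown() when finished.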

new_array(java_class, *dimensions)

Creates a Java array of type java_class with the given dimensions.

Parameters:
  • java_class – The JavaClass instance representing the type of the array.

  • dimensions – The dimensions of the array, passed as separate positional arguments. For example, the dimensions 1, 2 would produce an array[1][2].

Return type:

A JavaArray instance.
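A short sketch of creating a two-dimensional array follows. The helper name make_int_matrix is hypothetical, and it assumes that the primitive type is reachable as gateway.jvm.int on an already connected gateway.

```python
def make_int_matrix(gateway, rows, cols):
    """Create a Java int[rows][cols] through an existing JavaGateway."""
    int_class = gateway.jvm.int  # JavaClass for the primitive type int
    # Each dimension is passed as its own positional argument.
    return gateway.new_array(int_class, rows, cols)
```

Passing a single dimension, as in gateway.new_array(int_class, 5), would request a one-dimensional array in the same way.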

new_jvm_view(name='custom jvm')

Creates a new JVM view with its own imports. A JVM view ensures that the import made in one view does not conflict with the import of another view.

Generally, each Python module should have its own view (to replicate Java behavior).

Parameters:

name – Optional name of the jvm view. Does not need to be unique, i.e., two distinct views can have the same name (internally, they will have a distinct id).

Return type:

A JVMView instance (same class as the gateway.jvm instance).
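The following sketch shows one way a module could keep its imports in its own view. The helper name and the view name are hypothetical; java_import is the function documented later in this module, and the gateway is assumed to be already connected.

```python
def make_collections_view(gateway):
    """Return a JVM view with java.util.* imported, separate from gateway.jvm."""
    # Imported here so that defining the helper does not itself require Py4J.
    from py4j.java_gateway import java_import

    view = gateway.new_jvm_view("collections view")  # the name is arbitrary
    java_import(view, "java.util.*")
    return view
```

Classes from the imported package can then be referenced through the view by their simple name, for example view.ArrayList(), while the imports of gateway.jvm and of other views are left untouched.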

restart_callback_server()

Shuts down the callback server (if started) and restarts a new one.

set_gateway_client(gateway_client)

Sets the gateway client for this JavaGateway. This sets the appropriate gateway_property and resets the main jvm view (self.jvm).

This is for advanced usage only and should only be set before the gateway is loaded.

shutdown(raise_exception=False)
Shuts down the GatewayClient and the CallbackServer.

Parameters:

raise_exception – If True, raise an exception if an error occurs while shutting down (very likely with sockets).

shutdown_callback_server(raise_exception=False)
Shuts down the CallbackServer.

Parameters:

raise_exception – If True, raise an exception if an error occurs while shutting down (very likely with sockets).

start_callback_server(callback_server_parameters=None)

Starts the callback server.

Parameters:

callback_server_parameters – parameters to use to start the server. If not provided, it will use the gateway callback server parameters.

Return type:

Returns True if the server was started by this call or False if it was already started (you cannot have more than one started callback server).

4.1.1.1. Examples

Using the jvm property:

>>> from py4j.java_gateway import JavaGateway
>>> gateway = JavaGateway()
>>> jvm = gateway.jvm
>>> l = jvm.java.util.ArrayList()
>>> l.append(10)
>>> l.append(1)
>>> jvm.java.util.Collections.sort(l)
>>> l
[1, 10]
>>> l.append(5)
>>> l.sort()
>>> l
[1, 5, 10]

Using auto_field:

First we declare a class that has a field AND a method called member:

package py4j.examples;
public class ExampleWithField {
    public int member = 1;
    public String member() {
        return "Hello World";
    }
}

Then we play with the class using the two possible values of auto_field:

>>> from py4j.java_gateway import JavaGateway, GatewayParameters, get_field, get_method
>>> java_gateway = JavaGateway() # auto_field = False
>>> example = java_gateway.jvm.py4j.examples.ExampleWithField()
>>> example.member()
u'Hello World'
>>> get_field(example,'member')
1
>>> java_gateway2 = JavaGateway(gateway_parameters=GatewayParameters(auto_field=True))
>>> example2 = java_gateway2.jvm.py4j.examples.ExampleWithField()
>>> example2.member
1
>>> get_method(example2,'member')()
u'Hello World'

4.1.2. GatewayParameters

class py4j.java_gateway.GatewayParameters(address='127.0.0.1', port=25333, auto_field=False, auto_close=True, auto_convert=False, eager_load=False, ssl_context=None, enable_memory_management=True, read_timeout=None, auth_token=None)

Wrapper class that contains all parameters that can be passed to configure a JavaGateway

Parameters:
  • address – the address to which the client will request a connection. If you’re passing an SSLContext with check_hostname=True then this address must match (one of) the hostname(s) in the certificate the gateway server presents.

  • port – the port to which the client will request a connection. Default is 25333.

  • auto_field – if False, objects accessed through this gateway won’t try to look up fields (they will be accessible only by calling get_field). If True, fields will be automatically looked up, possibly hiding methods of the same name and making method calls less efficient.

  • auto_close – if True, the connections created by the client close the socket when they are garbage collected.

  • auto_convert – if True, try to automatically convert Python objects like sequences and maps to Java Objects. Default value is False to improve performance and because it is still possible to explicitly perform this conversion.

  • eager_load – if True, the gateway tries to connect to the JVM by calling System.currentTimeMillis. If the gateway cannot connect to the JVM, it shuts itself down and raises an exception.

  • ssl_context – if not None, SSL connections will be made using this SSLContext

  • enable_memory_management – if True, tells the Java side when a JavaObject (reference to an object on the Java side) is garbage collected on the Python side.

  • read_timeout – if > 0, sets a timeout in seconds after which the socket stops waiting for a response from the Java side.

  • auth_token – if provided, an authentication token that clients must provide to the server when connecting.
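The sketch below combines several of these parameters. It assumes that a GatewayServer is already listening on the given port; the helper name largest is hypothetical, and java.util.Collections.max is used only as a convenient Java method that accepts a collection.

```python
def largest(values, port=25333):
    """Ask the JVM for the maximum of a Python list via java.util.Collections."""
    # Imported here so that defining the helper does not itself require Py4J.
    from py4j.java_gateway import JavaGateway, GatewayParameters

    # eager_load=True makes a connection problem surface at construction time.
    params = GatewayParameters(port=port, auto_convert=True, eager_load=True)
    gateway = JavaGateway(gateway_parameters=params)
    try:
        # With auto_convert=True the Python list is converted to a Java
        # collection automatically before the call is sent.
        return gateway.jvm.java.util.Collections.max(values)
    finally:
        gateway.close()
```

With auto_convert left at False, the list would have to be converted explicitly before being passed to the Java method.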

4.1.3. CallbackServerParameters

class py4j.java_gateway.CallbackServerParameters(address='127.0.0.1', port=25334, daemonize=False, daemonize_connections=False, eager_load=True, ssl_context=None, accept_timeout='DEFAULT', read_timeout=None, propagate_java_exceptions=False, auth_token=None)

Wrapper class that contains all parameters that can be passed to configure a CallbackServer

Parameters:
  • address – the address to which the client will request a connection

  • port – the port to which the client will request a connection. Default is 25334.

  • daemonize – If True, will set the daemon property of the server thread to True. The callback server will exit automatically if all the other threads exit.

  • daemonize_connections – If True, callback server connections are executed in daemonized threads and will not block the exit of a program if non daemonized threads are finished.

  • eager_load – If True, the callback server is automatically started when the JavaGateway is created.

  • ssl_context – if not None, the SSLContext’s certificate will be presented to callback connections.

  • accept_timeout – if > 0, sets a timeout in seconds after which the callback server stops waiting for a connection, checks whether it should shut down, and if not, waits again for a connection. The default is 5 seconds, which roughly means that it can take up to 5 seconds to shut down the callback server.

  • read_timeout – if > 0, sets a timeout in seconds after which the socket stops waiting for a call or command from the Java side.

  • propagate_java_exceptions – if True, any Py4JJavaError raised by a Python callback will cause the nested java_exception to be thrown on the Java side. If False, the Py4JJavaError will manifest as a Py4JException on the Java side, just as with any other kind of Python exception. Setting this option is useful if you need to implement a Java interface where the user of the interface has special handling for specific Java exception types.

  • auth_token – if provided, an authentication token that clients must provide to the server when connecting.
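A minimal sketch of enabling callbacks follows. It assumes that a GatewayServer is already running with its default ports; the helper name is hypothetical, and the choice of daemonized threads is only one possible configuration.

```python
def open_gateway_with_callbacks():
    """Create a JavaGateway whose callback server will not block program exit."""
    # Imported here so that defining the helper does not itself require Py4J.
    from py4j.java_gateway import JavaGateway, CallbackServerParameters

    callback_params = CallbackServerParameters(
        daemonize=True,              # the server thread is a daemon thread
        daemonize_connections=True,  # so are the callback connection threads
    )
    # eager_load defaults to True, so the callback server is started as soon
    # as the JavaGateway is created.
    return JavaGateway(callback_server_parameters=callback_params)
```

Calling gateway.shutdown() on the returned object when the work is done closes both the gateway client and the callback server.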

4.1.4. GatewayClient

This is an internal class. Do not use it directly.

class py4j.java_gateway.GatewayClient(address='127.0.0.1', port=25333, auto_close=True, gateway_property=None, ssl_context=None, gateway_parameters=None)

Responsible for managing connections to the JavaGateway.

This implementation is thread-safe and connections are created on-demand. This means that Py4J-Python can be accessed by multiple threads and messages are sent to and processed concurrently by the Java Gateway.

When creating a custom JavaGateway, it is recommended to pass an instance of GatewayClient instead of a GatewayConnection: both have the same interface, but the client supports multiple threads and connections, which is essential when using callbacks.

Parameters:
  • gateway_parameters – the set of parameters used to configure the GatewayClient.

  • gateway_property – used to keep gateway preferences without a cycle with the gateway

close()

Closes all currently opened connections.

This operation is not thread safe and is only a best effort strategy to close active connections.

All connections are guaranteed to be closed only if no other thread is accessing the client and no call is pending.

garbage_collect_object(target_id)

Tells the Java side that there is no longer a reference to this JavaObject on the Python side.

send_command(command, retry=True, binary=False)
Sends a command to the JVM. This method is not intended to be called directly by Py4J users. It is usually called by JavaMember instances.

Parameters:
  • command – the string command to send to the JVM. The command must follow the Py4J protocol.

  • retry – if True, the GatewayClient tries to resend a message if it fails.

  • binary – if True, we won’t wait for a Py4J-protocol response from the other end; we’ll just return the raw connection to the caller. The caller becomes the owner of the connection, and is responsible for closing the connection (or returning it to this GatewayClient’s pool using _give_back_connection).

Return type:

the string answer received from the JVM (The answer follows the Py4J protocol). The guarded GatewayConnection is also returned if binary is True.

shutdown_gateway()

Sends a shutdown command to the gateway. This will close the gateway server: all active connections will be closed. This may be useful if the lifecycle of the Java program must be tied to the Python program.

4.1.5. GatewayConnection

This is an internal class. Do not use it directly.

class py4j.java_gateway.GatewayConnection(gateway_parameters, gateway_property=None)

Default gateway connection (socket based) responsible for communicating with the Java Virtual Machine.

Parameters:
  • gateway_parameters – the set of parameters used to configure the GatewayClient.

  • gateway_property – contains gateway preferences to avoid a cycle with gateway

close(reset=False)

Closes the connection by closing the socket.

If reset is True, sends a RST packet with SO_LINGER

send_command(command)
Sends a command to the JVM. This method is not intended to be called directly by Py4J users: it is usually called by JavaMember instances.

Parameters:

command – the string command to send to the JVM. The command must follow the Py4J protocol.

Return type:

the string answer received from the JVM (The answer follows the Py4J protocol).

shutdown_gateway()

Sends a shutdown command to the gateway. This will close the gateway server: all active connections will be closed. This may be useful if the lifecycle of the Java program must be tied to the Python program.

start()

Starts the connection by connecting to the address and the port

4.1.6. JVMView

This is an internal class. Do not use it directly.

class py4j.java_gateway.JVMView(gateway_client, jvm_name, id=None, jvm_object=None)

A JVMView allows access to the Java Virtual Machine of a JavaGateway.

This can be used to reference static members (fields and methods) and to call constructors.

4.1.7. JavaObject

This is an internal class. Do not use it directly.

Represents a Java object from which you can call methods or access fields.

4.1.8. JavaMember

This is an internal class. Do not use it directly.

Represents a member (i.e., method) of a JavaObject. For now, only methods are supported. Fields are retrieved directly and are not contained in a JavaMember.

4.1.9. JavaClass

This is an internal class. Do not use it directly.

class py4j.java_gateway.JavaClass(fqn, gateway_client)

A JavaClass represents a Java Class from which static members can be retrieved. JavaClass instances are also needed to initialize an array.

Usually, JavaClass instances are not created through their constructor; they are created while accessing the jvm property of a gateway, e.g., gateway.jvm.java.lang.String.

_java_lang_class

Property that returns the java.lang.Class associated with this JavaClass. Equivalent to calling .class in Java.

4.1.10. JavaPackage

This is an internal class. Do not use it directly.

class py4j.java_gateway.JavaPackage(fqn, gateway_client, jvm_id=None)

A JavaPackage represents part of a Java package from which Java classes can be accessed.

Usually, JavaPackage instances are not created through their constructor; they are created while accessing the jvm property of a gateway, e.g., gateway.jvm.java.lang.

4.1.11. PythonProxyPool

This is an internal class. Do not use it directly.

class py4j.java_gateway.PythonProxyPool

A PythonProxyPool manages proxies that are passed to the Java side. A proxy is a Python class that implements a Java interface.

A proxy has an internal class named Java with a member named implements which is a list of fully qualified names (string) of the implemented interfaces.

The PythonProxyPool implements a subset of the dict interface: pool[id], del(pool[id]), pool.put(proxy), pool.clear(), id in pool, len(pool).

The PythonProxyPool is thread-safe.
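The proxy layout described above can be sketched as a plain Python class. The interface name com.example.Listener and the method onMessage are hypothetical; a real proxy would name an interface that exists on the Java side and implement its methods.

```python
class RecordingListener:
    """Python object meant to be passed to Java as a callback proxy."""

    def __init__(self):
        self.received = []

    def onMessage(self, message):
        # Would be invoked from the Java side through the callback server.
        self.received.append(message)
        return len(self.received)

    class Java:
        # Fully qualified names of the Java interfaces this object implements.
        implements = ["com.example.Listener"]


listener = RecordingListener()
```

An instance like this would be passed as an argument to a Java method that expects the interface, on a gateway that was created with callback server parameters.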

clear()
put(object, force_id=None)

Adds a proxy to the pool.

Parameters:

object – The proxy to add to the pool.

Return type:

A unique identifier associated with the object.

4.1.12. CallbackServer

class py4j.java_gateway.CallbackServer(pool, gateway_client, port=25334, address='127.0.0.1', callback_server_parameters=None)

The CallbackServer is responsible for receiving callback connection requests from the JVM. Usually connections are reused on the Java side, but there is at least one connection per concurrent thread.

Parameters:
  • pool – the pool responsible for tracking Python objects passed to the Java side.

  • gateway_client – the gateway client used to call Java objects.

  • callback_server_parameters – An instance of CallbackServerParameters used to configure various options of the callback server.

close()

Closes all active callback connections

get_listening_address()

Returns the address on which the callback server is listening. May be different from address if address was an alias (e.g., localhost).

get_listening_port()

Returns the port on which the callback server is listening. Different from port when port is 0.

run()

Starts listening and accepting connection requests.

This method is called when invoking CallbackServer.start(). A CallbackServer instance is created and started automatically when a JavaGateway instance is created.

shutdown()

Stops listening and accepting connection requests. All live connections are closed.

This method can safely be called by another thread.

start()

Starts the CallbackServer. This method should be called by the client instead of run().

4.1.13. CallbackConnection

This is an internal class. Do not use it directly.

class py4j.java_gateway.CallbackConnection(pool, input, socket_instance, gateway_client, callback_server_parameters, callback_server)

A CallbackConnection receives callbacks and garbage collection requests from the Java side.

CallbackConnection is a subclass of threading.Thread. The remainder of this description is inherited from the Thread constructor documentation and refers to the base-class arguments, not to the parameters in the signature above. The Thread constructor should always be called with keyword arguments:

  • group – should be None; reserved for future extension when a ThreadGroup class is implemented.

  • target – the callable object to be invoked by the run() method. Defaults to None, meaning nothing is called.

  • name – the thread name. By default, a unique name is constructed of the form “Thread-N” where N is a small decimal number.

  • args – the argument tuple for the target invocation. Defaults to ().

  • kwargs – a dictionary of keyword arguments for the target invocation. Defaults to {}.

If a subclass overrides the constructor, it must make sure to invoke the base class constructor (Thread.__init__()) before doing anything else to the thread.

close(reset=False)
run()

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.

4.1.14. Py4J Core Signals

This is a list of signals that Py4J can send during various lifecycle events. They are all instances of Signal.

py4j.java_gateway.server_connection_stopped

Signal sent when a Python (Callback) Server connection is stopped.

Will supply the connection argument, an instance of CallbackConnection.

The sender is the CallbackServer instance.

py4j.java_gateway.server_connection_started

Signal sent when a Python (Callback) Server connection is started.

Will supply the connection argument, an instance of CallbackConnection.

The sender is the CallbackServer instance.

py4j.java_gateway.server_connection_error

Signal sent when a Python (Callback) Server encounters an error while waiting for a connection.

Will supply the error argument, an instance of Exception.

The sender is the CallbackServer instance.

py4j.java_gateway.server_started

Signal sent when a Python (Callback) Server is started

Will supply the server argument, an instance of CallbackServer

The sender is the CallbackServer instance, but it is not possible for now to bind to a CallbackServer instance before it is started (limitation of the current JavaGateway and ClientServer API).

py4j.java_gateway.server_stopped

Signal sent when a Python (Callback) Server is stopped

Will supply the server argument, an instance of CallbackServer

The sender is the CallbackServer instance.

py4j.java_gateway.pre_server_shutdown

Signal sent when a Python (Callback) Server is about to shut down.

Will supply the server argument, an instance of CallbackServer

The sender is the CallbackServer instance.

py4j.java_gateway.post_server_shutdown

Signal sent when a Python (Callback) Server is shut down.

Will supply the server argument, an instance of CallbackServer

The sender is the CallbackServer instance.
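A sketch of subscribing to two of these signals follows. It assumes that Signal offers a connect(receiver) method and that receivers are called with keyword arguments, including the connection argument described above; the helper name and the log format are hypothetical.

```python
def watch_callback_connections(log):
    """Append an entry to `log` whenever a callback connection starts or stops."""
    # Imported here so that defining the helper does not itself require Py4J.
    from py4j.java_gateway import (
        server_connection_started, server_connection_stopped)

    def on_started(**kwargs):
        # Accept any keyword arguments and pick out the one of interest.
        log.append(("started", kwargs.get("connection")))

    def on_stopped(**kwargs):
        log.append(("stopped", kwargs.get("connection")))

    server_connection_started.connect(on_started)
    server_connection_stopped.connect(on_stopped)
    # Returned so that the caller can keep references to the receivers.
    return on_started, on_stopped
```

Because the receivers may be called from callback server threads, a real handler would need to keep its work short and thread-safe.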

4.1.15. Py4J Functions

The following functions can be used to import packages or to get a particular field or method when fields and methods in a Java class have the same name:

py4j.java_gateway.java_import(jvm_view, import_str)

Imports the package or class specified by import_str in the jvm view namespace.

Parameters:
  • jvm_view – The jvm_view in which to import a class/package.

  • import_str – The class (e.g., java.util.List) or the package (e.g., java.io.*) to import.

py4j.java_gateway.launch_gateway(port=0, jarpath='', classpath='', javaopts=[], die_on_exit=False, redirect_stdout=None, redirect_stderr=None, daemonize_redirect=True, java_path='java', create_new_process_group=False, enable_auth=False, cwd=None, return_proc=False, use_shell=False)

Launch a Gateway in a new Java process.

The redirect parameters accept file-like objects, Queue, or deque. When text lines are sent to the stdout or stderr of the child JVM, these lines are redirected to the file-like object (write(line)), the Queue (put(line)), or the deque (appendleft(line)).

The text line will contain a newline character.

Only text output is accepted on stdout and stderr. If you wish to communicate with the child JVM through bytes, you need to create your own helper function.

Parameters:
  • port – the port to launch the Java Gateway on. If no port is specified then an ephemeral port is used.

  • jarpath – the path to the Py4J jar. Only necessary if the jar was installed at a non-standard location or if Python is using a different sys.prefix than the one that Py4J was installed under.

  • classpath – the classpath used to launch the Java Gateway.

  • javaopts – an array of extra options to pass to Java (the classpath should be specified using the classpath parameter, not javaopts.)

  • die_on_exit – if True, the Java gateway process will die when this Python process exits or is killed.

  • redirect_stdout – where to redirect the JVM stdout. If None (default) stdout is redirected to os.devnull. Otherwise accepts a file descriptor, a queue, or a deque. Will send one line at a time to these objects.

  • redirect_stderr – where to redirect the JVM stderr. If None (default) stderr is redirected to os.devnull. Otherwise accepts a file descriptor, a queue, or a deque. Will send one line at a time to these objects.

  • daemonize_redirect – if True, the consumer threads will be daemonized and will not prevent the main Python process from exiting. This means the file descriptors (stderr, stdout, redirect_stderr, redirect_stdout) might not be properly closed. This is not usually a problem, but in case of errors related to file descriptors, set this flag to False.

  • java_path – If None, Py4J will use $JAVA_HOME/bin/java if $JAVA_HOME is defined, otherwise it will use “java”.

  • create_new_process_group – If True, the JVM is started in a new process group. This ensures that signals sent to the parent Python process are not forwarded to the JVM. For example, sending Ctrl-C/SIGINT won’t interrupt the JVM. Note that if the Python process dies, the Java process will stay alive, which may be a problem in some scenarios.

  • enable_auth – If True, the server will require clients to provide an authentication token when connecting.

  • cwd – If not None, path that will be used as the current working directory of the Java process.

  • return_proc – If True, returns the Popen object returned when the JVM process was created.

  • use_shell – If True, Popen will start the Java process with shell=True.

Return type:

the port number of the Gateway server or, when auth is enabled, a 2-tuple with the port number and the auth token.
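Because lines are added to a deque with appendleft(line), the oldest line ends up at the right end. The sketch below shows one way to read the captured output in the order it was written; both helper names are hypothetical, and launch_with_captured_output assumes that the Py4J jar and a java executable can be found.

```python
from collections import deque


def drain_oldest_first(buffer):
    """Remove and return all buffered lines, oldest first."""
    lines = []
    while buffer:
        # New lines arrive on the left, so the oldest one is on the right.
        lines.append(buffer.pop())
    return lines


def launch_with_captured_output():
    """Launch a JVM and return (port, stdout_buffer, stderr_buffer)."""
    # Imported here so that defining the helper does not itself require Py4J.
    from py4j.java_gateway import launch_gateway

    out, err = deque(), deque()
    port = launch_gateway(redirect_stdout=out, redirect_stderr=err,
                          die_on_exit=True)
    return port, out, err
```

Each returned line still ends with its newline character, so callers that print the lines may want to strip it first.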

py4j.java_gateway.get_field(java_object, field_name)

Retrieves the field named field_name from the java_object.

This function is useful when auto_field=False in a gateway or Java object.

Parameters:
  • java_object – the instance containing the field

  • field_name – the name of the field to retrieve

py4j.java_gateway.set_field(java_object, field_name, value)

Sets the field named field_name of java_object to value.

This function is the only way to set a field because the assignment operator in Python cannot be overloaded.

Parameters:
  • java_object – the instance containing the field

  • field_name – the name of the field to set

  • value – the value to assign to the field
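A small sketch of reading and then writing a field follows. The helper name increment_field is hypothetical, and it assumes that the named field is a public numeric field on the Java object, such as member in the ExampleWithField class shown earlier.

```python
def increment_field(java_object, field_name, amount=1):
    """Add `amount` to a numeric public field of a Java object."""
    # Imported here so that defining the helper does not itself require Py4J.
    from py4j.java_gateway import get_field, set_field

    new_value = get_field(java_object, field_name) + amount
    # The field is written through set_field rather than by assignment.
    set_field(java_object, field_name, new_value)
    return new_value
```

For an ExampleWithField instance named example, a call would look like increment_field(example, "member").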

py4j.java_gateway.get_method(java_object, method_name)

Retrieves a reference to the method of an object.

This function is useful when auto_field=True and an instance field has the same name as a method. The full signature of the method is not required: it is determined when the method is called.

Parameters:
  • java_object – the instance containing the method

  • method_name – the name of the method to retrieve

py4j.java_gateway.is_instance_of(gateway, java_object, java_class)

Indicates whether a Java object is an instance of the provided java_class.

Parameters:
  • gateway – the JavaGateway instance

  • java_object – the JavaObject instance

  • java_class – can be a string (fully qualified name), a JavaClass instance, or a JavaObject instance

py4j.java_gateway.get_java_class(java_class)

Returns the java.lang.Class of a JavaClass. This is equivalent to calling .class in Java.

Parameters:

java_class – An instance of JavaClass

Return type:

An instance of JavaObject that corresponds to a java.lang.Class
