Log to unified log from Python

Hi, I would like to log to the unified log from a Python program. I found (on Big Sur) that this can be done by writing to a UNIX domain socket at /var/run/syslog using UDP. The log records show up in the Console for Warning/Error levels, but it seems the standard log formats defined in RFC5424 or RFC3164 are not recognized. All fields except PRI go to the log message instead of being used as timestamp, hostname, etc.

Two questions:

  • What is the recommendation for logging to the unified log from Python?
  • Is there a documentation for logging via Unix domain sockets, and if not, what format of the log message us supported for that?

I found … that this can be done by writing to a UNIX domain socket at /var/run/syslog using UDP datagrams [1]

it seems the standard log formats defined in RFC5424 or RFC3164 are not recognized.

Correct. This mechanism is completely unrelated to the traditional Unix-y syslog protocol.

Rather, this is known as “simple logging” and it’s intended to be used by code that’s unable to use libSystem. For this reason the core code exists in libplatform. If you’re curious, you can go rummaging around in the Darwin open source.

However, as always, the fact that something is in Darwin doesn’t make it API. From a Python perspective this is most definitely not the droid you’re looking for. My advice is that you create a module that bridges to the native <os/log.h> API.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] I’m just being nitpicky here. UDP is not a Unix domain sockets protocol.

Thanks much. That helps!

Andy

Hi Quinn. I have to come back to this. I tried to implement a Python extension module, and got stuck at the point where the os_log...() functions require that the format string must be a C string literal. Of course, when you wrapper such a function into another programming language such as Python, the format string comes in as a string object, and that cannot become a C string literal.

The code snippet from the extension module is (simplified):

static PyObject *macos_oslog_os_log_error(PyObject *self, PyObject *args) {
    const char * py_format = NULL;
    if (!PyArg_ParseTuple(args, "s", &py_format)) {
        return NULL;
    }
    os_log_with_type(OS_LOG_DEFAULT, OS_LOG_TYPE_ERROR, py_format);
    Py_RETURN_NONE;
}

The compile errors are (the first one seems to be the relevant one):

src/macos_oslog.c:111:5: error: static_assert failed due to requirement '__builtin_constant_p(py_format)' "format/label/description argument must be a string constant"
    os_log_with_type(OS_LOG_DEFAULT, OS_LOG_TYPE_ERROR, py_format);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/log.h:181:9: note: expanded from macro 'os_log_with_type'
        OS_LOG_CALL_WITH_FORMAT(_os_log_impl, \
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/trace_base.h:87:28: note: expanded from macro 'OS_LOG_CALL_WITH_FORMAT'
        OS_LOG_PRAGMA_PUSH OS_LOG_STRING(LOG, _os_fmt_str, fmt); \
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/trace_base.h:78:9: note: expanded from macro 'OS_LOG_STRING'
        _Static_assert(__builtin_constant_p(_str), \
        ^              ~~~~~~~~~~~~~~~~~~~~~~~~~~
src/macos_oslog.c:111:5: error: array initializer must be an initializer list or string literal
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/log.h:181:9: note: expanded from macro 'os_log_with_type'
        OS_LOG_CALL_WITH_FORMAT(_os_log_impl, \
        ^
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/trace_base.h:87:47: note: expanded from macro 'OS_LOG_CALL_WITH_FORMAT'
        OS_LOG_PRAGMA_PUSH OS_LOG_STRING(LOG, _os_fmt_str, fmt); \
                                              ^
src/macos_oslog.c:111:57: error: os_log() format argument is not a string constant
    os_log_with_type(OS_LOG_DEFAULT, OS_LOG_TYPE_ERROR, py_format);
                                                        ^~~~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/log.h:182:55: note: expanded from macro 'os_log_with_type'
                (&__dso_handle, _log_tmp, _type_tmp), format, ##__VA_ARGS__); \
                                                      ^~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/trace_base.h:88:99: note: expanded from macro 'OS_LOG_CALL_WITH_FORMAT'
        uint8_t _Alignas(16) OS_LOG_UNINITIALIZED _os_fmt_buf[__builtin_os_log_format_buffer_size(fmt, ##__VA_ARGS__)]; \
                                                                                                  ^~~

The relevant code in os/trace_base.h is:

#define OS_LOG_STRING(_ns, _var, _str) \
        _Static_assert(__builtin_constant_p(_str), \
                "format/label/description argument must be a string constant"); \
        __attribute__((section("__TEXT,__oslogstring,cstring_literals"),internal_linkage)) \
        static const char _var[] __asm(OS_STRINGIFY(OS_CONCAT(LOS_##_ns, __COUNTER__))) = _str

Because it uses array initialization from a string, the string must be a string literal and cannot be a char* or any const variation of it. So the assertion is understandable, but not understandable is the array initialization that causes this restriction.

Update:

If I change the format string pointer to const char * const py_format, then the assertion passes, but there are still other errors:

src/macos_oslog.c:111:5: error: array initializer must be an initializer list or string literal
    os_log_with_type(OS_LOG_DEFAULT, OS_LOG_TYPE_ERROR, py_format);
    ^
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/log.h:181:9: note: expanded from macro 'os_log_with_type'
        OS_LOG_CALL_WITH_FORMAT(_os_log_impl, \
        ^
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/trace_base.h:87:47: note: expanded from macro 'OS_LOG_CALL_WITH_FORMAT'
        OS_LOG_PRAGMA_PUSH OS_LOG_STRING(LOG, _os_fmt_str, fmt); \
                                              ^
src/macos_oslog.c:111:57: error: os_log() format argument is not a string constant
    os_log_with_type(OS_LOG_DEFAULT, OS_LOG_TYPE_ERROR, py_format);
                                                        ^~~~~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/log.h:182:55: note: expanded from macro 'os_log_with_type'
                (&__dso_handle, _log_tmp, _type_tmp), format, ##__VA_ARGS__); \
                                                      ^~~~~~
/Library/Developer/CommandLineTools/SDKs/MacOSX11.sdk/usr/include/os/trace_base.h:88:99: note: expanded from macro 'OS_LOG_CALL_WITH_FORMAT'
        uint8_t _Alignas(16) OS_LOG_UNINITIALIZED _os_fmt_buf[__builtin_os_log_format_buffer_size(fmt, ##__VA_ARGS__)]; \
                                                                                                  ^~~

This time, the __builtin_os_log_format_buffer_size() macro determines the log buffer size at compile time, which again does not work with a string object whose size is known only at run time.

These implementation choices basically prevent implementing a Python wrapper. I'm not an expert in bindings to other languages, but I suppose it would run into the same issues.

Any advice on how to solve this issue?

If you agree that there is no way around this, could you please take that to the dev team for considering an approach that supports integrating this into other programming languages?

After doing some more googling, I found this: https://github.com/CocoaLumberjack/CocoaLumberjack/issues/883 Using this trick worked for me:

Python extension code:

static PyObject *macos_oslog_os_log(PyObject *self, PyObject *args) {
    const char * py_format = NULL;
    if (!PyArg_ParseTuple(args, "s", &py_format)) {
        return NULL;
    }
    os_log_with_type(OS_LOG_DEFAULT, OS_LOG_TYPE_ERROR, "%{public}s", py_format);
    Py_RETURN_NONE;
}

At this point, the format argument of the Python function must be a string without further formatting strings inside. So either this remains like that and the Python user will need to build the string to log, or I add support for Python-style formatting strings in the Python extension code, after a test with os_log_type_enabled().

For me, this seems to solve the problem. If you have better suggestions, let me know.

Still, it is a bit of a nuisance that Apple builds C interfaces that cannot be integrated into other programming languages.

Still, it is a bit of a nuisance that Apple builds C interfaces that cannot be integrated into other programming languages.

You should see the hoops that we had to jump through to get this working in Swift (-:

Seriously though, there’s a good reason for why this has to be a literal. One goal of the unified logging system is efficiency and, as a step towards that goal, it doesn’t actually record format strings in the log. Imagine this code:

os_log(.debug, log: log, "F%04x: will start", flow.flowID)
… do some stuff …
os_log(.debug, log: log, "F%04x: did start", flow.flowID)

[Sorry this is in Swift; that’s the project I have open at the moment.]

The strings F%04x: will start and F%04x: did start don’t get stored in the log. Rather, the log stores a reference to the binary and an offset within that binary to the format string. That makes the resulting log messages a lot smaller, because they only data they store [1] is the format string reference and the flow ID number.

This is important when you deal with a lot of log messages. For example, that snippet is from a Network Extension content filter provider I’m working on, so it’s recorded for every TCP flow running through the system. That can be a lot of flows, and those savings add up.

This is one of the reasons why we strongly recommend against wrapper the system log API. However, when you’re coming from Python you don’t get a lot of choice.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] There’s also a bunch of metadata, like the thread ID.

Quinn, I understand the argument with storing just an ID and the variable parameters of the message, and yes I do not see a chance for Python to do that. Thanks again for your help!!

For all readers of this thread: I have meanwhile completed my work on the Python bindings for the unified logging system, and have uploaded an initial version of that to Pypi: https://pypi.org/project/macos-oslog/

Feedback is welcome, and feel free to open issues in the issue tracker of the project: https://github.com/andy-maier/macos-oslog/issues

I have in fact one more question :-)

I have read somewhere that os_release() should be used to release log objects that were allocated with os_log_create(). os_release() is a function to release memory.

In my macos-oslog Python package, I have implemented the use of os_release() in a Python os_log_release() function. It succeeds, but does not seem to actually release the log object. The released log object works after an os_release() call as it did before.

  • Is os_release() the correct function to release log objects?
  • If not, how can they be released?

Is os_release() the correct function to release log objects?

Yes, but…

OSLog objects are effectively eternal [1]. Technically this is an implementation detail of the logging system but it’s pretty well baked in there.

This is one of the reasons Your Friend the System Log advises you not to “use too many unique subsystem and context pairs”.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

[1] Consider this program:

import os.log

func f1() {
    let log = OSLog(subsystem: "sub", category: "cat")
    print(ObjectIdentifier(log))
}

func f2() {
    let log = OSLog(subsystem: "sub", category: "cat")
    print(ObjectIdentifier(log))
}

f1()
f2()

f1() and f2() always print the same value:

ObjectIdentifier(0x00000001007044a0)
ObjectIdentifier(0x00000001007044a0)
Log to unified log from Python
 
 
Q