A Foray into Mac and Python Automation

I went through a scripting spike recently in hopes of automating a workflow. This spike led me down an even deeper rabbit hole: AppleScript/AppleEvents and Python GUI automation.

Backstory

My software development team was asked to help automate a workflow for a client. The goal: automate downloading an image and loading it into an audio visualizer, Synesthesia.

Synesthesia display

Part of this workflow involves clicking a media refresh button to load the newly downloaded image. Additionally, this workflow would be performed on a MacBook and shouldn’t take any longer than it would to manually perform the same operations.

Our first swipe at automating relied heavily on the coordinates of the media refresh button. This implementation worked initially, but was brittle. If someone moved the Synesthesia window, the coordinates would be off, the refresh never toggled, and no new media would appear. The client would be left wondering: did I actually download that image? Do I need to click the refresh button?

AppleScript and AppleEvents

Again, a challenging part of automating this workflow is clicking the media refresh button to auto-populate any new media. Synesthesia is GUI heavy — lots of buttons.

Communicating through application events

When we initially attempted to interact with Synesthesia, we found that it currently has no API to interact with. This prompted me to start looking at other application-level ways to communicate with Synesthesia. In particular, I found AppleEvents and AppleScript.

AppleEvents are messages between processes in macOS while AppleScript is a language used to facilitate and automate that communication. Here’s part of a bash script using AppleScript:


osascript <<EOF
tell application "System Events"
    tell front window of application process "Synesthesia"
        set uiElems to entire contents
    end tell
end tell
EOF

The above script does the following:

  • Use the osascript utility to execute AppleScript from a bash script, with here-document syntax (<<EOF ... EOF) to pass in multiple lines.
  • Send commands to System Events.
  • Target the front window of the app Synesthesia.
  • Set the variable uiElems to the entire contents of the window.
  • End the block of commands for the front window of Synesthesia.
  • End the block of commands for System Events.
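The same script can also be launched from Python rather than bash. Here is a minimal sketch using only the standard library. The helper names build_osascript_command and run_applescript are my own for illustration, and the sketch relies on osascript accepting repeated -e flags, one per line of script.

```python
import subprocess

# The same AppleScript as in the bash version above
UI_DUMP_SCRIPT = '''
tell application "System Events"
    tell front window of application process "Synesthesia"
        set uiElems to entire contents
    end tell
end tell
'''


def build_osascript_command(script):
    """Turn a multi-line AppleScript into an osascript argument list."""
    command = ["osascript"]
    for line in script.strip().splitlines():
        # One -e flag per line builds up a multi-line script
        command += ["-e", line.strip()]
    return command


def run_applescript(script):
    """Run the script and return what osascript printed (macOS only)."""
    result = subprocess.run(
        build_osascript_command(script),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if osascript reports a failure
    )
    return result.stdout.strip()
```

Calling run_applescript(UI_DUMP_SCRIPT) on a Mac with Synesthesia open should return the same dump of UI elements as the bash version, provided the terminal or Python process has been granted Accessibility permissions.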

The problem with this script is that I’m asking for the entirety of the UI’s guts, and Synesthesia is GUI intensive. Here’s a sampling of the output:

applescript gumbo

That’s thoroughly discouraging. In this output gumbo, knowing what element to click on is guesswork. The conclusion: interacting with elements in Synesthesia programmatically is problematic if you are trying to tell macOS to click an element by name. There are few elements with actual labels, and none of them say refresh.
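One way to get a rough feel for a dump like this is to tally what kinds of elements it contains before hunting for a specific one. The sketch below is an illustration only. It assumes the dump arrives as a single comma-separated list of element references such as "button 1 of window Synesthesia of application process Synesthesia", and the function name tally_element_kinds is my own. Element names that contain ", " would confuse the split.

```python
import re
from collections import Counter


def tally_element_kinds(dump):
    """Count element kinds (button, group, ...) in an 'entire contents' dump."""
    kinds = Counter()
    for reference in dump.split(", "):
        # Keep only the innermost element, e.g. "button 1"
        head = reference.split(" of ", 1)[0].strip()
        if not head:
            continue
        # Drop a trailing index so "button 1" and "button 2" count together
        match = re.match(r"^(.*?)\s+\d+$", head)
        kinds[match.group(1) if match else head] += 1
    return kinds
```

A tally like this will not find the refresh button on its own, but it shows quickly whether the window exposes mostly unlabeled groups and generic UI elements.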

So, using Mac’s Accessibility Inspector, I was able to find a class list for the refresh button.

Accessibility Inspector

My original hope was that I could do something as simple as this:

osascript <<EOF
tell application "System Events"
    tell front window of application process "Synesthesia"
       click UI element "gallery-module__refreshPanel___29kAI"
    end tell
end tell
EOF

TL;DR: I failed to click “gallery-module__refreshPanel___29kAI”. I wrote many AppleScripts that traversed these UI elements in search of anything that mentioned refresh. Eventually, I realized that all this traversing blew well past our time goal.

What’s the point of automation if it doesn’t fit your workflow or save you time?

Hunting Actions and Operations

Next, I sought out discoverable actions or operations related to Synesthesia in any way I could. Most helped us gain more insight into the problem at hand. Here are some of them:

  • Activity Monitor: I could see Synesthesia as a process, but that’s the extent of it.
  • Running the Synesthesia executable from the command line: ./Synesthesia, run from the MacOS folder inside the app bundle, provided no discernible results beyond some error logs.
  • Streaming live system logs in real time: I found this command: log stream --predicate 'process == "Synesthesia"'. But Synesthesia, as an audio visualizer, produces a wall of logs, some of which include names like com.apple.WebKit:ProcessSuspension. Even with pared-down logs, I could see that toggling refresh gave me no real visibility into operation or action names.
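Paring down that wall of logs can be scripted. The following is a rough sketch, not what we actually ran: it wraps the log stream command from the last bullet and keeps only the lines that mention a keyword. The function names are my own, and the matching is a plain case-insensitive substring check.

```python
import subprocess


def build_log_command(process_name):
    """Build the `log stream` command for a single process."""
    predicate = f'process == "{process_name}"'
    return ["log", "stream", "--predicate", predicate]


def keep_matching(lines, keyword):
    """Return only the lines that mention keyword, ignoring case."""
    keyword = keyword.lower()
    return [line for line in lines if keyword in line.lower()]


def stream_matching(process_name, keyword):
    """Yield live log lines mentioning keyword (macOS only, runs until interrupted)."""
    with subprocess.Popen(
        build_log_command(process_name),
        stdout=subprocess.PIPE,
        text=True,
    ) as proc:
        for line in proc.stdout:
            if keep_matching([line], keyword):
                yield line.rstrip()
```

Looping over stream_matching("Synesthesia", "refresh") while clicking the refresh button is one quick way to confirm whether anything refresh-related is logged at all.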

sdef

In my haste to discover a programmatic refresh, I stumbled onto this gem: sdef, or scripting definition. This one is worth its own header.

I attempted to run sdef /Applications/Synesthesia.app but got a -192 error, which means an app doesn’t support scripting 🫡. However, you get an entirely different output when you run the following command:

sdef /System/Library/CoreServices/Finder.app

The output is XML. In AppleScript, you primarily work with the names of the classes, properties, and commands defined in the sdef output. The code attributes, like aevtquit, are the underlying Apple Event codes used by macOS and are generally not written in AppleScript directly. Instead, you’d use the human-readable names, like “quit”.

<command name="quit" code="aevtquit" description="Quit the Finder"/>

For example, to quit an application, you would write:

tell application "Finder"
    quit
end tell

Note to self: keep this in your pocket for future use.
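Since the sdef output is XML, the command names can be pulled out with a few lines of Python. This is a minimal sketch using the standard library's ElementTree. SAMPLE_SDEF is a hand-written, heavily trimmed stand-in for real sdef output, and list_commands and read_sdef are names I made up for illustration.

```python
import subprocess
import xml.etree.ElementTree as ET

# Hand-written stand-in for real sdef output, trimmed to two commands
SAMPLE_SDEF = """<dictionary>
  <suite name="Standard Suite">
    <command name="quit" code="aevtquit" description="Quit the Finder"/>
    <command name="open" code="aevtodoc" description="Open the specified object(s)"/>
  </suite>
</dictionary>"""


def list_commands(sdef_xml):
    """Return (name, code) pairs for every <command> element in an sdef document."""
    root = ET.fromstring(sdef_xml)
    return [(cmd.get("name"), cmd.get("code")) for cmd in root.iter("command")]


def read_sdef(app_path):
    """Run sdef on an app bundle and return its XML output (macOS only)."""
    result = subprocess.run(
        ["sdef", app_path], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a Mac, list_commands(read_sdef("/System/Library/CoreServices/Finder.app")) should list the commands Finder exposes to AppleScript. For an app without a scripting definition, such as Synesthesia, read_sdef raises CalledProcessError instead.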

sdef and my other rabbit hole discoveries didn’t give me the results I was hoping for, but they did give me new ideas for my own automated workflows and pushed me toward GUI automation.

Python GUI Automation

After the AppleScript trials, I decided our best bet was to tweak our original implementation. In that implementation, we use PyAutoGUI to click on certain coordinates, which is great if we know those coordinates.

So, the new approach is to always set the window to a specific size and position. Thankfully, my meanderings through AppleScript told me I could do exactly that. Adding it to our current Python implementation required PyObjC and AppKit.

  • PyObjC is a Python binding to the Objective-C runtime, which gives us interaction with macOS libraries and frameworks.
  • AppKit is a macOS framework that, through PyObjC, gives us the classes we need to find and activate running applications before handing off to AppleScript.

With AppKit, we can interact with the macOS workspace through NSWorkspace, a class in AppKit. The function below looks through the list of running apps for the one we want and returns it. If the app isn't running, it returns None.

from AppKit import NSWorkspace


def find_window_by_app_name(app_name):
    """Return the running application named app_name, or None if it isn't running."""
    # Get list of running apps
    running_apps = NSWorkspace.sharedWorkspace().runningApplications()
    for app in running_apps:
        # Check if app name matches
        if app.localizedName() == app_name:
            return app
    return None

In the second function, set_window_size_and_position, we (hopefully) find the app, activate it through PyObjC if found, and then use subprocess to run AppleScript that restores the window if it is minimized, moves it, and sets its width and height.

import subprocess

from AppKit import NSApplicationActivateIgnoringOtherApps


def set_window_size_and_position(app_name, x, y, width, height):
    app = find_window_by_app_name(app_name)
    if app:
        # Bring app to foreground, make it active
        app.activateWithOptions_(NSApplicationActivateIgnoringOtherApps)

        print(f"Moving and resizing window for '{app_name}'.")
        # AppleScript that restores minimized windows, then moves and resizes window 1
        apple_script = f"""
        tell application "System Events" to tell process "{app_name}"
            set frontmost to true
            repeat with win in windows
                if value of attribute "AXMinimized" of win is true then
                    set value of attribute "AXMinimized" of win to false
                end if
            end repeat
            set frontmost to true
            set position of window 1 to {{{x}, {y}}}
            set size of window 1 to {{{width}, {height}}}
        end tell
        """
        # Run the AppleScript through the osascript command-line tool
        subprocess.run(["osascript", "-e", apple_script])
    else:
        # Otherwise, let user know app not found
        print(f"Application '{app_name}' not found.")

With this in place, Synesthesia’s window position, window size, and therefore button coordinates are the same every time we run our script.
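With the window pinned in place, a button can be described by its offset from the window's top-left corner rather than by raw screen coordinates. The sketch below illustrates that idea under stated assumptions: REFRESH_OFFSET is a hypothetical value rather than a measured one, and click_point and click_refresh are names I chose for illustration. PyAutoGUI is imported inside the function so the arithmetic can be tried without it installed.

```python
# Hypothetical offset of the refresh button from the window's top-left corner
REFRESH_OFFSET = (412, 96)


def click_point(window_x, window_y, offset_x, offset_y):
    """Convert a window-relative offset into absolute screen coordinates."""
    return (window_x + offset_x, window_y + offset_y)


def click_refresh(window_x, window_y, offset=REFRESH_OFFSET):
    """Click the refresh button, assuming the window sits at (window_x, window_y)."""
    import pyautogui  # third-party; imported lazily so the helper above stays testable

    x, y = click_point(window_x, window_y, *offset)
    pyautogui.click(x, y)
```

For example, after calling set_window_size_and_position("Synesthesia", 0, 25, 1280, 800), click_refresh(0, 25) would click at the point returned by click_point(0, 25, 412, 96). If the window position ever needs to change, only the two window arguments change and the offset stays the same.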

Where do we go from here?

Based on my findings, my team has plenty of ideas for improving this workflow further, such as notifications for the user, placement of additional apps, and clicking play on a curated playlist. Beyond that, the exercise gave me quite a few new tools for my tool belt: an audio visualizer, AppleScript, macOS tools like Accessibility Inspector and sdef, and some cool Python libraries.
