Java and AppleScript: Speech Part 3

Here’s a video of me using speech recognition on a Full Tilt play money table:

As related here and here, I recently hooked Java into Mac OS X’s Speech Recognition API and made the mouse click on buttons in response to spoken commands.

Today I investigated ways to find the poker room table buttons. For now I decided to limit myself to the Full Tilt Poker “classic” theme.

It seems that in a Full Tilt window, the buttons always sit at fixed proportions of the window size. If I resize the window, the buttons resize too, and stay at the same location relative to the window’s width and height. Good.

Now I have to work out the location and size of the current Full Tilt window. I discovered yesterday that as of Java 6, you can run AppleScript directly within Java through the javax.script API. This makes the task extremely easy. It’s not particularly efficient, but it works. The following Java method uses AppleScript to fetch the bounds of the main window of the frontmost application:


public Rectangle getMainWindowBounds() {
    try {
        // Ask System Events for the position and size of the main window
        // of whichever process is currently frontmost.
        String script =
                "tell application \"System Events\"\n" +
                " set theprocess to the first process whose frontmost is true\n" +
                " set thewindow to the value of attribute \"AXMainWindow\" of theprocess\n" +
                " set thelocation to the position of thewindow\n" +
                " set thesize to the size of thewindow\n" +
                " return thelocation & thesize\n" +
                "end tell";

        ScriptEngine engine = new ScriptEngineManager().getEngineByName("AppleScript");

        // The script returns a four-element list: x, y, width, height.
        ArrayList list = (ArrayList) engine.eval(script);
        int x = ((Long) list.get(0)).intValue();
        int y = ((Long) list.get(1)).intValue();
        int width = ((Long) list.get(2)).intValue();
        int height = ((Long) list.get(3)).intValue();
        return new Rectangle(x, y, width, height);

    } catch (ScriptException e) {
        throw new RuntimeException(e);
    }
}
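
The casts to Long in that method depend on the element type the script engine happens to return. As a rough sketch only, the conversion could be pulled into a small helper that accepts any Number and checks the list length first. The class name BoundsParser and the method toRectangle are invented for this illustration, and the sample values stand in for a real engine.eval(script) result.

```java
import java.awt.Rectangle;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any library: turns the four numbers
// returned by the script (x, y, width, height) into a Rectangle.
public class BoundsParser {

    public static Rectangle toRectangle(List<?> values) {
        if (values == null || values.size() != 4) {
            throw new IllegalArgumentException("Expected 4 values but got: " + values);
        }
        int[] ints = new int[4];
        for (int i = 0; i < 4; i++) {
            Object value = values.get(i);
            if (!(value instanceof Number)) {
                throw new IllegalArgumentException("Not a number: " + value);
            }
            // Accept Long, Integer, Double and so on, rather than Long only.
            ints[i] = ((Number) value).intValue();
        }
        return new Rectangle(ints[0], ints[1], ints[2], ints[3]);
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for the result of engine.eval(script).
        List<Object> sample = Arrays.<Object>asList(100L, 50L, 800L, 600L);
        System.out.println(toRectangle(sample));
    }
}
```

With something like this in place, getMainWindowBounds() could hand the evaluated list straight to toRectangle and fail with a readable message if the script returns something unexpected.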

This was the missing piece of the puzzle. The centre of button 1 – either Fold or Check – is at (window.x + window.width * 0.689, window.y + window.height * 0.946).
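
To make the arithmetic concrete, here is a small worked sketch of that formula. The window bounds are made up for illustration, and ButtonLocator is a hypothetical class name; only the factors 0.689 and 0.946 come from the text. For a window at (100, 50) that is 800 wide and 600 high, the centre works out to (651, 617) once the fractions are truncated.

```java
import java.awt.Point;
import java.awt.Rectangle;

// Hypothetical class name; the factors 0.689 and 0.946 come from the post.
public class ButtonLocator {

    // Scales a proportional position to absolute screen coordinates.
    static Point pointInWindow(Rectangle window, double xFactor, double yFactor) {
        int x = (int) (window.x + window.width * xFactor);
        int y = (int) (window.y + window.height * yFactor);
        return new Point(x, y);
    }

    public static void main(String[] args) {
        // Made-up window bounds, used only to show the arithmetic.
        Rectangle small = new Rectangle(100, 50, 800, 600);
        Rectangle large = new Rectangle(0, 0, 1600, 1200);

        // x = 100 + 800 * 0.689 = 651.2, y = 50 + 600 * 0.946 = 617.6
        System.out.println(pointInWindow(small, 0.689, 0.946));
        // The same factors give a different point once the window is resized.
        System.out.println(pointInWindow(large, 0.689, 0.946));
    }
}
```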

Now it is possible for me to open a Full Tilt poker table, move and resize the window, then speak my commands. As a test I played on two tables simultaneously and had it working, although I had to manually set the focus to the window I wanted to speak to.

Finally here is the entire app. It’s low on error-checking, and assumes Full Tilt is the current front-most application. It’s a good starting point for someone who wants to take the idea further.



package com.barbarysoftware.pokercopilot.speech;

import org.rococoa.cocoa.foundation.NSAutoreleasePool;

import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import javax.swing.*;
import java.awt.*;
import java.awt.event.ActionEvent;
import java.awt.event.InputEvent;
import java.util.ArrayList;

public class SpeechDemo {

    public static void main(String[] args) {
        Runnable runnable = new Runnable() {
            public void run() {
                final NSAutoreleasePool pool = NSAutoreleasePool.new_();
                try {
                    new SpeechDemo().setUpSpeechRecognition();
                } finally {
                    pool.release();
                }
            }
        };
        new Thread(runnable).start();

        JFrame frame = new JFrame("Speech Demo");
        frame.getContentPane().add(new JButton(new AbstractAction("Quit") {
            public void actionPerformed(ActionEvent e) {
                System.exit(0);
            }
        }));

        frame.pack();
        frame.setLocationRelativeTo(null);
        frame.setVisible(true);
    }

    private void setUpSpeechRecognition() {
        SpeechRecognizer speechRecognizer = new SpeechRecognizer();
        speechRecognizer.setSpeechRecognizerListener(new SpeechRecognizerListener() {
            public void didRecognizeCommand(String command) {
                // Fold and Check share button 1; only one of them is shown at a time.
                if (command.equals("Fold")) {
                    clickButton1();

                } else if (command.equals("Check")) {
                    clickButton1();

                } else if (command.equals("Call")) {
                    clickButton2();

                } else if (command.equals("Raise")) {
                    clickButton3();

                }
            }
        });

        // "Grab" is registered here but has no handler in this demo.
        speechRecognizer.setCommands("Fold", "Check", "Raise", "Call", "Grab");
        speechRecognizer.setListensInForegroundOnly(false);
        speechRecognizer.setDisplayedCommandsTitle("Poker Copilot");
        speechRecognizer.startListening();
    }


    private void clickButton1() {
        clickAtPoint(getPointInWindow(0.689, 0.946));
    }

    private void clickButton2() {
        clickAtPoint(getPointInWindow(0.820, 0.946));
    }

    private void clickButton3() {
        clickAtPoint(getPointInWindow(0.951, 0.946));
    }

    private Point getPointInWindow(double xfactor, double yfactor) {
        final Rectangle bounds = getMainWindowBounds();
        final int x = (int) (bounds.x + bounds.width * xfactor);
        final int y = (int) (bounds.y + bounds.height * yfactor);
        return new Point(x, y);
    }

    public Rectangle getMainWindowBounds() {
        try {
            String script = "tell application \"System Events\"\n" +
                    " set theprocess to the first process whose frontmost is true\n" +
                    " set thewindow to the value of attribute \"AXMainWindow\" of theprocess\n" +
                    " set thelocation to the position of thewindow\n" +
                    " set thesize to the size of thewindow\n" +
                    " return thelocation & thesize\n" +
                    "end tell";

            ScriptEngine engine = new ScriptEngineManager().getEngineByName("AppleScript");

            ArrayList list = (ArrayList) engine.eval(script);
            int x = ((Long) list.get(0)).intValue();
            int y = ((Long) list.get(1)).intValue();
            int width = ((Long) list.get(2)).intValue();
            int height = ((Long) list.get(3)).intValue();
            return new Rectangle(x, y, width, height);

        } catch (ScriptException e) {
            throw new RuntimeException(e);
        }
    }

    private void clickAtPoint(Point point) {
        try {
            Robot robot = new Robot();
            robot.mouseMove(point.x, point.y);
            robot.mousePress(InputEvent.BUTTON1_MASK);
            robot.mouseRelease(InputEvent.BUTTON1_MASK);

        } catch (AWTException e) {
            throw new RuntimeException(e);
        }
    }
}
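
For anyone taking the idea further, the if/else chain in didRecognizeCommand could be replaced by a lookup table, so that adding a command means adding one entry. The following is only a sketch under that assumption: CommandTable, xFactorFor and commands are hypothetical names, and the factors are simply the ones used in SpeechDemo.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: maps each spoken command to the horizontal factor of
// the button it should click. The factors are the ones used in SpeechDemo.
public class CommandTable {

    // All three buttons share the same vertical factor in SpeechDemo.
    static final double Y_FACTOR = 0.946;

    private static final Map<String, Double> X_FACTORS = new LinkedHashMap<String, Double>();
    static {
        X_FACTORS.put("Fold", 0.689);   // button 1
        X_FACTORS.put("Check", 0.689);  // button 1
        X_FACTORS.put("Call", 0.820);   // button 2
        X_FACTORS.put("Raise", 0.951);  // button 3
    }

    // Returns the x factor for a command, or null if the command has no button.
    static Double xFactorFor(String command) {
        return X_FACTORS.get(command);
    }

    // The commands that have a button, in insertion order.
    static String[] commands() {
        return X_FACTORS.keySet().toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String command : commands()) {
            System.out.println(command + " -> " + xFactorFor(command) + ", " + Y_FACTOR);
        }
        // A command with no entry, such as "Grab", yields null.
        System.out.println("Grab -> " + xFactorFor("Grab"));
    }
}
```

Inside the listener, a non-null result from xFactorFor(command) could be passed along as clickAtPoint(getPointInWindow(xFactor, CommandTable.Y_FACTOR)), and a null result simply ignored.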

What’s next? I think I’ve done as much with this as I intend to. It’s been a diversion for me during some particularly cold winter days. I’m considering putting everything I’ve done on Google Code as an open source project, as the basis of an auto-hotkey app. If you’re interested in seeing this become an open source project, let me know in the comments.