Super Simple Single Instance Application in Windows using C#

There are cases in the Windows world where you only want a single instance of an application.  So, if we launch the application either from the Start menu or command line, or we launch the process from another application, we only ever want one instance – the same one.

If an instance of the application already exists, we want the application to come to the foreground.  Not only that but we want the view in that single instance to change based on how the second instance was invoked.

Much has been written on this including using .NET remoting.  I chose a far simpler way involving memory mapped files, passing command line information from the second instance to the first.  (I freely acknowledge that there is a hack component to it.)

We will begin with the essence of the operation in the main function of your application.

/// <summary>
/// The main entry point for the application.
/// </summary>
[STAThread]
static void Main(string[] args)
{
  // The helper object for single instances
  ApplicationManagement applicationManagement = new ApplicationManagement();

  // If we have arguments, pass them to the original instance.
  // If we are creating the first instance, the same memory mapped file mechanism is used for simplicity.
  if (args.Length > 0)
    applicationManagement.SetArguments(args);

  // if a previous instance exists..
  if (applicationManagement.DoesPreviousInstanceExist())
  {
    // Exit stage left
    applicationManagement.ShowPreviousInstance();
    return;
  }

  // in the case where we are the first instance and we don't have a previous one, then
  // default to the default view
  if (args.Length == 0)
    applicationManagement.SetArguments(new string[] { frmMain.commandLineFlagForm1 });

  Application.EnableVisualStyles();
  Application.SetCompatibleTextRenderingDefault(false);

  // pass the application management object to the main form
  frmMain main = new frmMain();
  main.applicationManagement = applicationManagement;
  Application.Run(main);
}

Here is the truth table for how this code behaves.

Command line   | New instance created     | Previous instance exists
no arguments   | show default view        | do nothing (but set to top of Z-order)
has arguments  | set view to the argument | set view to the argument

Notice that if a previous instance exists, we want to bring the previous instance's window to the top of the Z-order.  The previous instance cannot bring itself to the foreground (short of attaching to the thread input of the topmost window, which we are not doing).  However, a newly created process, until it creates its first window, can send any window of any other process to the top of the Z-order.  This is a peculiarity of Windows, and it is why the new instance calls ShowPreviousInstance before it exits.

Now a hacky part…  In the main form, there is a timer that periodically checks for new arguments to appear.

private void tmrArguments_Tick(object sender, EventArgs e)
{
  string[] arguments = applicationManagement.GetArguments();
  if (arguments.Length > 0)
  {
    switch (arguments[0])
    {
      case commandLineFlagForm1:
        Form frm = ViewChildForm(typeof(frmForm1));
        if (arguments.Length == 3)
          (frm as frmForm1).SetSelectedProperty(arguments[1], arguments[2]);
        frm.WindowState = FormWindowState.Maximized;
        break;
      case commandLineFlagForm2:
        ViewChildForm(typeof(frmForm2)).WindowState = FormWindowState.Maximized;
        break;
    }
  }
}

Essentially, the form’s timer will periodically check to see if there are new arguments.  In this example, the command line can specify a form 1 or a form 2.  If form 1, it can pass additional information in as well.

So now we will get to the handling of the previous instance and bringing it to life and then we will deal with passing command line information.

The ApplicationManagement class that you see handles the single instance logic.  Its constructor determines whether a previous instance exists.

public ApplicationManagement()
{
  Process currentProcess = Process.GetCurrentProcess();
  m_previousInstance = (from process in Process.GetProcesses()
                         where
                         process.Id != currentProcess.Id &&
                         process.ProcessName.Equals(
                         currentProcess.ProcessName,
                         StringComparison.Ordinal)
                         select process).FirstOrDefault();
  CreateArgumentStream();
}

/// <summary>
/// Does the previous instance exist?
/// </summary>
/// <returns></returns>
public bool DoesPreviousInstanceExist()
{
  return m_previousInstance != null;
}

/// <summary>
/// Shows the instance of the specified window handle
/// </summary>
/// <param name="windowHandle">The window handle.</param>
private void ShowInstance(IntPtr windowHandle)
{
  ShowWindow(windowHandle, WindowShowStyle.ShowNormal);
  EnableWindow(windowHandle, true);
  SetForegroundWindow(windowHandle);
}

/// <summary>
/// Shows the previous instance.
/// </summary>
public void ShowPreviousInstance()
{
  ShowInstance(m_previousInstance.MainWindowHandle);
}

You can see the methods that determine whether a previous instance exists and that show the previous instance.  The declarations for the ShowWindow, EnableWindow, and SetForegroundWindow API calls can be obtained here:  http://www.pinvoke.net/.

Finally, here is the code in the ApplicationManagement class that handles passing arguments through the memory mapped buffer.

/// <summary>
/// The map name used for passing arguments through the MMF
/// </summary>
private const string mapName = "MySecretMapName";

/// <summary>
/// The argument stream length in bytes
/// </summary>
private const int argumentStreamLength = 512;

/// <summary>
/// The argument reference that will be used to pass arguments through a memory mapped file.
/// </summary>
private MemoryMappedFile m_arguments = null;

/// <summary>
/// The accessor to pass arguments through the memory mapped file
/// </summary>
private MemoryMappedViewAccessor m_argumentViewAccessor = null;
/// <summary>
/// Creates the argument stream.
/// </summary>
private void CreateArgumentStream()
{
  m_arguments = MemoryMappedFile.CreateOrOpen(mapName, argumentStreamLength, MemoryMappedFileAccess.ReadWrite);
  m_argumentViewAccessor = m_arguments.CreateViewAccessor();
  // Set up the buffer with nulls
  ClearArguments();
}

/// <summary>
/// Clears the arguments by setting everything to nulls which is still a valid string but has no meaning
/// </summary>
private void ClearArguments()
{
  byte[] buffer = new byte[argumentStreamLength];
  m_argumentViewAccessor.WriteArray<byte>(0, buffer, 0, buffer.Length);
}

/// <summary>
/// Gets the arguments from the memory mapped file and convert back to a string array.
/// We are using a less than pristine synchronization method of looking for any nulls in the string.
/// This method does however work as we are looking for the last null to be removed which should happen
/// in SetArguments.
/// </summary>
/// <returns></returns>
public string[] GetArguments()
{
  byte[] buffer = new byte[argumentStreamLength];

  m_argumentViewAccessor.ReadArray<byte>(0, buffer, 0, buffer.Length);
  // get the string of arguments from the buffer
  string args = Encoding.Unicode.GetString(buffer);
  // if there are any nulls, then we do nothing but return an empty string
  if (args.IndexOf('\0') != -1)
    return new string[0];

  // if there are no more nulls then we have something to evaluate
  args = args.Trim(); // remove the spaces
  ClearArguments();
  // split the string based on our marvelous separator
  return args.Split(new string[] { argumentSeparator }, StringSplitOptions.RemoveEmptyEntries);
}

/// <summary>
/// Sets the arguments from the command line. This is a poor man's way of doing this. A more "pristine" way would be to
/// serialize and deserialize the command line argument string array.
/// </summary>
/// <param name="args">The arguments.</param>
public void SetArguments(string[] args)
{
  // puts the nulls into the MMF
  ClearArguments();

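  // Pads with spaces so that no nulls remain once the string is written to the MMF.
  // Each UTF-16 character takes two bytes, so pad to half the buffer length in characters.
  string arguments = String.Join(argumentSeparator, args).PadRight(argumentStreamLength / sizeof(char), ' ');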
  byte[] buffer = Encoding.Unicode.GetBytes(arguments);
  m_argumentViewAccessor.WriteArray<byte>(0, buffer, 0, buffer.Length);
}

This is really simple information passing with some inherent synchronization, although it is a bit hacky.  Essentially, the memory mapped buffer starts out full of nulls.  When the new instance is going to put its command line arguments into the MMF, it converts them from a string array to a single joined string with a clever separator (lame).  Then it pads the string with spaces to the length of the buffer.

The primary instance will wait until there are no more nulls in the buffer before evaluating it.  In this way, we achieve some form of synchronization.  After the complete string is read and split back into the original string array, the buffer is cleared out.  (This is also clearly a last-one-in-wins scenario.)
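
To make the protocol concrete, here is a minimal sketch of the same idea in Python 3 rather than C#. It uses an anonymous mmap in a single process purely for illustration; the separator value and the function names are assumptions, not taken from the application above.

```python
import mmap

BUFFER_LEN = 512   # bytes, mirrors argumentStreamLength
SEPARATOR = "|"    # hypothetical separator; the original value is not shown


def clear_arguments(buf):
    """Fill the whole buffer with nulls, meaning 'nothing to read yet'."""
    buf.seek(0)
    buf.write(b"\0" * BUFFER_LEN)


def set_arguments(buf, args):
    """Writer side: join the arguments, then pad with spaces so no nulls remain."""
    clear_arguments(buf)
    # UTF-16 uses two bytes per character (assuming no surrogate pairs),
    # so pad to half the buffer length in characters.
    text = SEPARATOR.join(args).ljust(BUFFER_LEN // 2)
    buf.seek(0)
    buf.write(text.encode("utf-16-le"))


def get_arguments(buf):
    """Reader side: return [] until the buffer no longer contains any nulls."""
    buf.seek(0)
    text = buf.read(BUFFER_LEN).decode("utf-16-le")
    if "\0" in text:
        return []
    clear_arguments(buf)
    return [part for part in text.strip().split(SEPARATOR) if part]
```

The reader treats any remaining null as "the writer is not finished", which is the whole synchronization trick. A joined argument string longer than the buffer makes the write fail, so the buffer length caps the command line.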

So there it is.  If there are questions, post a comment.  I would love to see something other than “buy Nike shoes written in Chinese”.

Restricting input on winform text boxes

How do you get a Windows Forms text box to accept only positive floating point numbers? (I know. I know. This is old school. However, it does pay the bills.)

One way that I thought was appealing was to implement the KeyDown event and control the key-presses; like this:

private void txtBowelMovementsPerDay_KeyDown(object sender, System.Windows.Forms.KeyEventArgs e)
{
    // Allow numbers from keyboard
    if (e.KeyData >= Keys.D0 && e.KeyData <= Keys.D9)
        return;

    // Allow numbers from number pad
    if (e.KeyData >= Keys.NumPad0 && e.KeyData <= Keys.NumPad9)
        return;

    // Allow periods from keyboard and number pad
    if (e.KeyData == Keys.Decimal || e.KeyData == Keys.OemPeriod)
    {
        // but only one
        if (!txtBowelMovementsPerDay.Text.Contains('.'))
            return;
    }

    // Allow alt and shift to work
    if (e.Alt || e.Shift)
        return;

    // Allow the special keys to work
    switch (e.KeyData)
    {
        case Keys.End:
        case Keys.Enter:
        case Keys.Home:
        case Keys.Back:
        case Keys.Delete:
        case Keys.Escape:
        case Keys.Tab:
        case Keys.Left:
        case Keys.Right:
            return;
    }

    // Invalid input
    e.SuppressKeyPress = true;
    e.Handled = true;
}

A friend pointed out that while this was cool (ok, I may have changed the tone there), it doesn’t internationalize well because a) it assumes the US keyboard layout and b) it assumes a ‘.’ is always the radix point. Bad, bad, bad.

Instead he turned me on to letting Windows do its job: check whether the text is a valid float, and only enable the “OK” button when it is. Like this…

private void txtBowelMovementsPerDay_TextChanged(object sender, EventArgs e)
{
    btnOk.Enabled = TrackSpacingMicronsTextBoxIsValid();
}

private bool TrackSpacingMicronsTextBoxIsValid()
{
    float trackSpacingMicrons;

    // is it a valid float?
    if (float.TryParse(txtBowelMovementsPerDay.Text, out trackSpacingMicrons))
    {
        // is it a positive number?
        if (trackSpacingMicrons <= 0)
            return false;
        return true;
    }
    else
        return false;
}

This is a much better approach and facilitates i18n. While Winforms is old-school, the problem is nicely solved in WPF.

<TextBox Name="txtBowelMovementsPerDay" TextWrapping="Wrap" Text="{Binding Path=BowelMovementsPerDay, Mode=TwoWay, 
    UpdateSourceTrigger=PropertyChanged,StringFormat={}{0:##.##}}"></TextBox>

For systems where you must decide between validating input and preempting keystrokes, validation will always be best. By validation, I mean that you do not let the user continue unless all the data is correct. This is true for web page development as well, with one caveat: JavaScript's parseFloat always expects ‘.’ as the radix point, so locale-aware parsing has to be handled separately there.
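
The validate-instead-of-preempt idea is not specific to WinForms. Here is a minimal sketch in Python 3. The function name and the decimal_separator parameter are assumptions for illustration; a real application would take the separator from the user's locale settings.

```python
import math


def is_positive_float(text, decimal_separator="."):
    """Return True only if text parses as a finite number greater than zero."""
    text = text.strip()
    if decimal_separator != ".":
        # With a non-'.' radix point, a literal '.' is not valid input.
        if "." in text:
            return False
        text = text.replace(decimal_separator, ".")
    try:
        value = float(text)
    except ValueError:
        return False
    # Reject zero, negatives, NaN and infinity.
    return math.isfinite(value) and value > 0
```

The caller enables the "OK" action only while this returns True, so the user can type anything but cannot continue with bad data.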

Stupid Simple File Server from Spare Parts

Another installment of things I like because they just work.

I frequently have the need to spin up a file server, either for temporary use or for longer term storage.
Recently I was migrating a system from one place to another and needed a large amount of fast temporary storage.
At home I have a file server for mass storage of music, videos etc.

There are lots of solutions to this. Could always just use an old Win XP box, or set up a Linux server using your favorite distro. Then there are free software systems, FreeNAS is a good example.

One I like is Server Elements.
They have a line of products including one that boots from a floppy! Hardware requirements for all of the products are very modest. Basically take some old PC, stuff it full of old disks, create a bootable CD, or Thumb Drive and off you go. Prices range from $10 – $35 for the 64bit product with some media streaming capabilities.

For quick set up – Server Elements is really great and worth the small price.
If you want more features, FreeNAS is a very good choice.

Here is an excellent article on the topic.
http://www.smbitjournal.com/2012/04/choosing-an-open-storage-operating-system/

Swiss Army Knife of SMTP Servers

Just wanted to give a shout out to my favorite SMTP Server.
When you find yourself needing a robust, easy to configure and support SMTP/IMAP/POP3 server that can handle about any messaging need you could think of … check out MDaemon from the folks at Alt-N Technologies.

How good is it? Well, not that long ago, Research In Motion (RIM), you know, the BlackBerry people, bought? the company.
(not positive about the change in ownership as I don’t work for the company)
During this time BlackBerry support was added. Then the market turned, and Alt-N became independent again.

I’ve been using MDaemon for many years now in several environments. I’ve tried others, but keep going back to this one. It just works. Rarely do I find a messaging problem that it doesn’t handle. You *nix fans will even like it. Although it runs on Windows machines, it has a *nix feel to it as everything is kept in files.

beans

Simple Python JSON server based on jsonrpclib

I needed a simple Python JSON server that runs in its own thread and is easily extensible.  Let’s get right to the base class code (or super class for those who build down).

#! /usr/bin/python

import threading

import jsonrpclib
import jsonrpclib.SimpleJSONRPCServer

class JsonServerThread (threading.Thread):
  def __init__(self, host, port):
    threading.Thread.__init__(self)
    self.daemon = True
    self.stopServer = False
    self.host = host
    self.port = port
    self.server = None

  def _ping(self):
    pass

  def stop(self):
    self.stopServer = True
    jsonrpclib.Server("http://" + str(self.host) + ":" + str(self.port))._ping()
    self.join()

  def run(self):
    self.server = jsonrpclib.SimpleJSONRPCServer.SimpleJSONRPCServer((self.host, self.port))
    self.server.logRequests = False
    self.server.register_function(self._ping)

    self.addMethods()

    while not self.stopServer:
      self.server.handle_request()
    self.server = None

  # hook for derived classes to register their own methods

  def addMethods(self):
    pass

So the idea is simple.  Derive a new class from this and implement the addMethods method and the methods themselves.

#! /usr/bin/python

import jsonServer

class JsonInterface(jsonServer.JsonServerThread):
  def __init__(self, host, port):
    jsonServer.JsonServerThread.__init__(self, host, port)

  def addMethods(self):
    self.server.register_function(self.doOneThing)
    self.server.register_function(self.doAnother)

  def doOneThing(self, obj):
    return obj

  def doAnother(self):
    return "why am I doing something else?"

In the derived class, implement the methods and register them in addMethods.  That is all.  Now we can simply worry about implementation.  Be aware of any thread synchronization or exception handling your methods need.  Jsonrpclib takes care of exceptions raised by a method as well and converts them into a JSON exception.

One last item of note.  In the base class, the stop method is interesting.  Since handle_request() is a blocking call in the thread, we need to set the “stop” flag and make a simple request.  The _ping method does this for us.  Then we join on the thread waiting for it to end gracefully.
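
The same stop pattern can be sketched with nothing but the Python 3 standard library. This is not jsonrpclib; http.server stands in for it here, and the class names are hypothetical. The point is only the flag plus wake-up request plus join sequence.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Answers every GET with an empty 200 response."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


class StoppableServerThread(threading.Thread):
    def __init__(self, host="127.0.0.1", port=0):
        super().__init__(daemon=True)
        self.stop_requested = False
        # Bind here so the port is known before start(); port 0 picks a free one.
        self.server = HTTPServer((host, port), PingHandler)
        self.host, self.port = self.server.server_address[:2]

    def run(self):
        # handle_request() blocks until one request has been served.
        while not self.stop_requested:
            self.server.handle_request()
        self.server.server_close()

    def stop(self):
        self.stop_requested = True
        try:
            # Dummy request so the blocked handle_request() returns.
            conn = http.client.HTTPConnection(self.host, self.port, timeout=2)
            conn.request("GET", "/")
            conn.getresponse().read()
            conn.close()
        except (OSError, http.client.HTTPException):
            pass  # the loop may already have exited and closed the socket
        self.join()
```

The try/except in stop covers the case where the loop notices the flag before the dummy request arrives, in which case the request simply fails and the join returns right away.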

The jsonrpclib is a very useful library and well done.  By the way, this example is for Python 2.7.  On Ubuntu 14.04, you can install this using “apt-get install python-jsonrpclib”.

Pulling Documents for Searching

In a prior post, I noted how to set up elasticsearch with apache2.  In this post, we will look at how to cache a set of files from a Windows share on your web server and index them.

To do this, we need to do the following steps:

  1. Initialize the index the first time.
  2. Mount a share.
  3. Rsync the data between the machines.
  4. Get the files that exist on the SMB share.
  5. Read what has been indexed.
  6. Diff the lists from steps 4 and 5.
  7. Index the new files on the share.
  8. Delete (the index and file) the files that no longer exist on the share.

By the way, a lot of this was done in Python 2.7 (as opposed to Python 3.x in some other posts I have).

Initialize the Index

The following script will “reset” the index and create it new.

#! /usr/bin/python

import httplib 
import binascii
import os
import glob
import socket

import hostinfo

def connRequest(conn, verb, url, body = None):
    if body == None:
        conn.request(verb, url)
    else:
        conn.request(verb, url, body)
    return conn.getresponse().read()

def connInitialize(conn):
    print connRequest(conn, 'DELETE', hostinfo.INDEX)
    print connRequest(conn, 'PUT', hostinfo.INDEX, '{  "settings" : { "index" : { "number_of_shards" : 1, "number_of_replicas" : 0 }}}') 
    print connRequest(conn, 'GET', '/_cluster/health?wait_for_status=green&pretty=1&timeout=5s' )
    print connRequest(conn, 'PUT', hostinfo.INDEX + '/attachment/_mapping', '{  "attachment" : {   "properties" : {      "file" : {        "type" : "attachment",        "fields" : {          "title" : { "store" : "yes" },          "file" : { "term_vector":"with_positions_offsets", "store":"yes" }        }      }    }  }}' )

def connRefresh(conn):
    print connRequest(conn, 'POST', '/_refresh')

socket.setdefaulttimeout(15)
conn = httplib.HTTPConnection(hostinfo.HOST)
connInitialize(conn)
connRefresh(conn)

Mount a SMB share

On Ubuntu, you will need to install cifs-utils:  “sudo apt-get install cifs-utils”.

Once done, you can mount it by using the following command.  Choose your own mount point obviously and be prepared with your domain password.

sudo mount -t cifs //10.0.4.240/General /mnt/cifs -ousername=maksym.shyte,ro

Rsync Between Server and SMB Share

The easiest way to do this is to create a list of the files that you want to be searchable, then rsync using that list.  This copies the files efficiently and leaves you with a text file listing the files on the SMB share.

# Remember the starting directory so the list is written and read in one place
currentPath=$(pwd)

function addToList {
  find "$1" -name \*.pdf -o -name \*.doc -o -name \*.docx -o -name \*.xls -o -name \*.xlsx -o -name \*.ppt -o -name \*.pptx -o -name \*.txt | grep -v ".AppleDouble" | grep -v "~$" >> "$2"
}

cd /mnt/cifs

addToList . "$currentPath/rsynclist.txt"
#addToList ./Some\ Directory "$currentPath/rsynclist.txt"

rsync -av --files-from="$currentPath/rsynclist.txt" /mnt/cifs /var/www/search/data
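
If you would rather build the file list from Python than from find and grep, a minimal Python 3 sketch follows. The function name is hypothetical; the extension list and the .AppleDouble exclusion mirror the shell commands above.

```python
import os

EXTENSIONS = (".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx", ".txt")


def build_file_list(root):
    """Return paths relative to root for every file with a searchable extension."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into .AppleDouble metadata directories.
        dirnames[:] = [d for d in dirnames if d != ".AppleDouble"]
        for name in filenames:
            if name.endswith(EXTENSIONS):
                full_path = os.path.join(dirpath, name)
                matches.append(os.path.relpath(full_path, root))
    return sorted(matches)
```

Writing the result to a text file, one path per line, gives a list usable with rsync's --files-from option. The paths come out relative to the root without the leading "./" that find prints.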

Read the Index

To read the index, the following script pulls the entries out of the index and writes them to a file.  Each line holds the location of the document and its key.  You will also need to resolve the paths in the previous file list against this index list, as they are related by the source and destination directories passed to rsync.

#! /usr/bin/python

import httplib 
import json
import sys
import os
import codecs

import hostinfo

argc = len(sys.argv)
if argc != 2:
    print os.path.basename(sys.argv[0]), "<index file name>"
    sys.exit(-1)

indexFileName = sys.argv[1]

def connRequest(conn, verb, url, body = None):
    if body == None:
        conn.request(verb, url)
    else:
        conn.request(verb, url, body)
    return conn.getresponse().read()

conn = httplib.HTTPConnection(hostinfo.HOST)
data = json.loads(connRequest(conn, 'GET', hostinfo.INDEX + '/_search?search_type=scan&scroll=10m&size=10', '{"query":{"match_all" :{}}, "fields":["location"]}' ))

print data
total = data["hits"]["total"]

#scroll session id, used to request the next batch of data
scrollId = data["_scroll_id"]
counter = 0; 

data = json.loads(connRequest(conn, 'GET', hostinfo.SITE + '/_search/scroll?scroll=10m', scrollId))

#print data

f = codecs.open(indexFileName, "w", "utf8")

while len(data["hits"]["hits"]) > 0:
    for item in data["hits"]["hits"]: 
        f.write(item["fields"]["location"][0] + ',' + item["_id"] + '\n')
        f.flush()

    counter = counter + len(data["hits"]["hits"])
    print "Reading Index:", counter, "of", total

    scrollId = data["_scroll_id"]
    resp = connRequest(conn, 'GET', hostinfo.SITE + '/_search/scroll?scroll=10m', scrollId)
    #print resp
    data = json.loads(resp)

f.close()
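
The path-resolving step is not shown in this post. As a rough Python 3 sketch, assume the file list holds entries such as ./dir/file.pdf (the form find prints) and the index file holds location,id lines. The function names and the prefix parameter are hypothetical.

```python
import posixpath


def normalize_file_entry(line):
    """Turn './Some Dir/a.pdf' into 'Some Dir/a.pdf'."""
    return posixpath.normpath(line.strip())


def normalize_index_line(line, prefix=""):
    """Strip an optional location prefix, keeping the ',id' suffix intact."""
    # The id is the last field, so split on the final comma only.
    location, _, doc_id = line.rstrip("\n").rpartition(",")
    if prefix and location.startswith(prefix):
        location = location[len(prefix):]
    return posixpath.normpath(location) + "," + doc_id
```

Once both lists are normalized this way, the path before the comma can be compared directly between the two.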

Diff the File List and the Index List

Next we need to diff the two.  We want to know the files we need to index and the files we want to delete.  The following script does that (presuming that the lists have been modified to point at the same directory – i.e. /var/www/search/data).  Out comes an “add” text file and a “delete” text file.

#! /usr/bin/python

import sys
import os

argc = len(sys.argv)
if argc != 5:
    print os.path.basename(sys.argv[0]), "<file list> <index list> <add file> <delete file>"
    sys.exit(-1)

def createMap(filename):
    ret = {}
    f = open(filename)
    lines = f.readlines()
    f.close()
    for line in lines:
        line = line.replace('\n','')
        split = line.split(',', 1)
        key = split[0]
        ret[key] = line
    return ret

fileMap = createMap(sys.argv[1])
indexMap = createMap(sys.argv[2])

# if the entry is in fileMap but not indexMap, it goes into the add file
# if the entry is in indexMap but not fileMap, it goes into the delete file
add = {}

for key in fileMap:
    if indexMap.has_key(key):
        del indexMap[key]
    else:
        add[key] = fileMap[key]

f = open(sys.argv[3], "w")
for key in add:
    f.write(add[key] + '\n');
f.close()

f = open(sys.argv[4], "w")
for key in indexMap:
    f.write(indexMap[key] + '\n');
f.close()
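
The same diff can be expressed with set operations on dictionary keys. Here is a compact sketch in Python 3 (the script above is Python 2.7); the function name is hypothetical, and the key is the text before the first comma, as in createMap.

```python
def diff_lists(file_lines, index_lines):
    """Return (lines to add to the index, lines to delete from the index)."""

    def to_map(lines):
        stripped = (line.rstrip("\n") for line in lines)
        return {line.split(",", 1)[0]: line for line in stripped if line}

    file_map = to_map(file_lines)
    index_map = to_map(index_lines)

    # On the share but not indexed yet: add. Indexed but gone from the share: delete.
    to_add = [file_map[key] for key in sorted(file_map.keys() - index_map.keys())]
    to_delete = [index_map[key] for key in sorted(index_map.keys() - file_map.keys())]
    return to_add, to_delete
```

For example, with a.pdf and b.pdf on the share and b.pdf and c.pdf in the index, a.pdf lands in the add list and the c.pdf line, including its id, lands in the delete list.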

Add to the Index

Next we iterate through all the files in the “add” list.

#! /usr/bin/python

import httplib 
import binascii
import sys
import os
import socket

import hostinfo

argc = len(sys.argv)
if argc != 3:
    print os.path.basename(sys.argv[0]), "<add list> <root directory>"
    sys.exit(-1)

rootFsDir = sys.argv[2] 

def connRequest(conn, verb, url, body = None):
    if body == None:
        conn.request(verb, url)
    else:
        conn.request(verb, url, body)
    return conn.getresponse().read()

def connInitialize(conn):
    print connRequest(conn, 'DELETE', hostinfo.INDEX)
    print connRequest(conn, 'PUT', hostinfo.INDEX, '{  "settings" : { "index" : { "number_of_shards" : 1, "number_of_replicas" : 0 }}}') 
    print connRequest(conn, 'GET', '/_cluster/health?wait_for_status=green&pretty=1&timeout=5s' )
    print connRequest(conn, 'PUT', hostinfo.INDEX + '/attachment/_mapping', '{  "attachment" : {   "properties" : {      "file" : {        "type" : "attachment",        "fields" : {          "title" : { "store" : "yes" },          "file" : { "term_vector":"with_positions_offsets", "store":"yes" }        }      }    }  }}' )

def connRefresh(conn):
    print connRequest(conn, 'POST', '/_refresh')

def connAddFile(conn, filename, rootFsDir):
    title = os.path.basename(filename)
    location = filename[len(rootFsDir):]

    with open(filename, 'rb') as f:
        data = f.read()

    if len(data) > hostinfo.LARGEST_BASE64_ATTACHMENT:
        print 'Not indexing because the file is too large', len(data)
    else:
        print 'Indexing file size', len(data)
        base64Data = binascii.b2a_base64(data)[:-1]
        attachment = '{ "file":"' + base64Data + '", "title" : "' + title + '", "location" : "' + location + '" }'
        print connRequest(conn, 'POST', hostinfo.INDEX + '/attachment/', attachment)

socket.setdefaulttimeout(30)
conn = httplib.HTTPConnection(hostinfo.HOST)
#connInitialize(conn)

f = open(sys.argv[1])
lines = f.readlines()
f.close()

idx = 0

rootFsDir = rootFsDir + '/'

for line in lines:
    line = line.replace('\n', '')
    idx = idx + 1
    filename = rootFsDir + line
    print idx, filename
    try:
        connAddFile(conn, filename, rootFsDir)
    except Exception, e:
        print str(e)
        conn = httplib.HTTPConnection(hostinfo.HOST)  

connRefresh(conn)
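
One caveat with connAddFile: the attachment body is assembled by string concatenation, so a title or location containing a double quote or a backslash produces invalid JSON. Here is a minimal sketch of building the same body with json.dumps, written for Python 3; the function name is hypothetical.

```python
import base64
import json


def build_attachment_body(data, title, location):
    """Return the JSON request body for one file, with all strings escaped."""
    return json.dumps({
        "file": base64.b64encode(data).decode("ascii"),
        "title": title,
        "location": location,
    })
```

The returned string can be passed as the request body in place of the concatenated attachment string.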

Delete the Files Not Needed

Finally, we delete the index entries and physical files that are no longer needed.

#! /usr/bin/python

import httplib 
import binascii
import sys
import os
import socket

import hostinfo

argc = len(sys.argv)
if argc != 3:
    print os.path.basename(sys.argv[0]), "<delete list> <root directory>"
    sys.exit(-1)

def connRequest(conn, verb, url, body = None):
    if body == None:
        conn.request(verb, url)
    else:
        conn.request(verb, url, body)
    return conn.getresponse().read()

def connRefresh(conn):
    print connRequest(conn, 'POST', '/_refresh')

def connDeleteFile(conn, index):
    print connRequest(conn, 'DELETE', hostinfo.INDEX + '/attachment/' + index)

socket.setdefaulttimeout(30)
conn = httplib.HTTPConnection(hostinfo.HOST)

f = open(sys.argv[1])
lines = f.readlines()
f.close()

idx = 0

for line in lines:
    line = line.replace('\n', '')
    idx = idx + 1
    split = line.split(',')
    filename = split[0]
    index = split[1]
    print "Delete:", idx, filename, index
    try:
        connDeleteFile(conn, index)
    except Exception, e:
        print str(e)
        conn = httplib.HTTPConnection(hostinfo.HOST)  

    try:
        os.remove(sys.argv[2] + '/' + filename)
    except:
        pass

connRefresh(conn)

There it is.  In my own setup I run all of these steps, including resolving the paths between the file list and the index list.  One further thing to note is that the hostinfo file referenced by the Python scripts looks like this:

#! /usr/bin/python

HOST = '127.0.0.1:9200'
SITE = ''

INDEX = SITE + '/basic'

LARGEST_BASE64_ATTACHMENT = 50000000

 

A Search Engine for Office Documents

Have you ever worked at a place where there was a mass of files and documents on a share and even the old timers forgot where important documents were?

Search by file name stinks and SharePoint has been another excuse to dump stuff that gets lost.

So I decided to figure out an easy way to get a content search engine up and looking through the files on a share.  I found a solution.  It isn’t pristine, for these reasons.

  1. Browsers can’t link to files on a share for obvious security reasons.
  2. For reason one, the decision was made to copy searchable documents onto the web server.  This is time consuming to transfer and duplicates information but the documents are served successfully.
  3. For reason two, it would be possible to add a server plugin that reads and delivers a file on a share.  Just haven’t done that yet.

So we will start with what we have and consider changing it later.

The basis for this will be Ubuntu 12.04 LTS.  Why?  Because I have such a machine handy, and the hardware is 9 years old.  This will be based on all the wonderful work of elasticsearch and Lucene.

So, here are the steps.  Remember, this is a bit hacky.

  1. Install apache2.  (In the case of Ubuntu, it is “sudo apt-get install apache2”.)
  2. Install openjdk-7-jre-headless.  (“sudo apt-get install openjdk-7-jre-headless”.)
  3. Download elasticsearch (from elasticsearch.org – the .com site takes you to pay-for products).  Because I am using Ubuntu, I thought I would use the apt repository.
  4. Follow the steps to start elasticsearch – in my case listed on the web site.  Be advised that elasticsearch binds to all interfaces on a free port between 9200 and 9300.  We will assume that the port is 9200 as it is in my case.  However, it probably should only bind to a port on localhost or at least, the security should be evaluated to make sure it complies with what you need.
  5. We will need two plugins.  You can install them from your elasticsearch/bin location.  In my case it was /usr/share/elasticsearch/bin/plugin.
    bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/2.0.0
    bin/plugin -install de.spinscale/elasticsearch-plugin-suggest/1.0.1-2.0.0

    Restart elasticsearch. (“sudo service elasticsearch restart”).  You will also need to verify the versions of these plugins.

  6. For apache2, make sure to enable the proxy, proxy_http, and ssl modules.  On Ubuntu, the “a2enmod” is an easy utility to do this.
  7. In my Apache setup, I added a new file called “elasticsearch” inside /etc/apache2/conf.d.  (Note that Ubuntu 13.10 doesn’t use a conf.d directory.   It could be added to the bottom of apache2.conf although I am sure there is a more “pristine” location.)  The contents are below.
    <IfModule proxy_module>
    <IfModule proxy_http_module>
    
    <Proxy *>
    <Limit GET > 
        allow from all 
    </Limit>
    
    <Limit POST PUT DELETE>
        order deny,allow 
        deny from all 
    </Limit>
    </Proxy>
    
    ProxyPreserveHost On
    ProxyRequests Off
    LogLevel debug
    ProxyPass /es http://localhost:9200/
    ProxyPassReverse /es http://localhost:9200/
    
    </IfModule>
    </IfModule>

    The application depends on the /es directory under web root. This can be changed along with the web pages that use it.

  8. Restart apache2.  (“sudo service apache2 restart”)
  9. Download the HTML and Javascript for the search pages from here:  Search HTML and Javascript.  It uses jQuery and jQueryUI and AJAX to perform the searching and suggestions.  Unzip and place in the web directory where you want it.  For me, I wanted a search subdirectory so I placed mine in /var/www/search.
  10. So, the last thing is to show how to index the files.  I am a fan of python so this is python code making http requests to elasticsearch adding the information.  The script below deletes the index, recreates it, and starts adding content to it – from files in a directory.
    #! /usr/bin/python
    
    import httplib 
    import binascii
    import os
    
    HOST = 'localhost:9200'
    INDEX = '/basic'
    
    def connRequest(conn, verb, url, body = None):
        if body == None:
            conn.request(verb, url)
        else:
            conn.request(verb, url, body)
        return conn.getresponse().read()
    
    def connAddFile(conn, filename, rootFsDir, httpPrefix):
        with open(filename, 'rb') as f:
            base64Data = binascii.b2a_base64(f.read())[:-1]
    
        title = os.path.basename(filename)
        location = httpPrefix + filename[len(rootFsDir):]
    
        attachment = '{ "file":"' + base64Data + '", "title" : "' + title + '", "location" : "' + location + '" }'
        print connRequest(conn, 'POST', INDEX + '/attachment/', attachment)
    
    conn = httplib.HTTPConnection(HOST)
    
    print connRequest(conn, 'DELETE', INDEX)
    
    print connRequest(conn, 'PUT', INDEX, '{  "settings" : { "index" : { "number_of_shards" : 1, "number_of_replicas" : 0 }}}') 
    
    print connRequest(conn, 'GET', '/_cluster/health?wait_for_status=green&pretty=1&timeout=5s' )
    
    print connRequest(conn, 'PUT', INDEX + '/attachment/_mapping', '{  "attachment" : {   "properties" : {      "file" : {        "type" : "attachment",        "fields" : {          "title" : { "store" : "yes" },          "file" : { "term_vector":"with_positions_offsets", "store":"yes" }        }      }    }  }}' )
    
    # Add files here repeatedly
    rootFsDir = '/var/www/search/data/'
    searchDir = ''          # This is for recursion through the directories
    httpPrefix = 'data/'
    # Make this recursive some day
    for file in os.listdir(rootFsDir + searchDir):
        connAddFile(conn, rootFsDir + searchDir + file, rootFsDir, httpPrefix)
    
    print connRequest(conn, 'POST', '/_refresh')
  11. If you decide to get more creative and add only new files and delete the old ones, we need to understand how to get the list of existing files that are indexed.  Then you just have to correlate the current state of the files on disk with the index list.  This script gets the indexes and the files associated with them.
    #! /usr/bin/python
    
    import httplib 
    import json
    import sys
    import os
    
    import hostinfo
    
    argc = len(sys.argv)
    if argc != 2:
        print "Usage:", os.path.basename(sys.argv[0]), "<indexFileName>"
        sys.exit(-1)
    
    indexFileName = sys.argv[1]
    
    def connRequest(conn, verb, url, body = None):
        if body == None:
            conn.request(verb, url)
        else:
            conn.request(verb, url, body)
        return conn.getresponse().read()
    
    conn = httplib.HTTPConnection(hostinfo.HOST)
    data = json.loads(connRequest(conn, 'GET', hostinfo.INDEX + '/_search?search_type=scan&scroll=10m&size=10', '{"query":{"match_all" :{}}, "fields":["location"]}' ))
    
    total = data["hits"]["total"]
    
    #scroll session id, used to request the next batch of data
    scrollId = data["_scroll_id"]
    counter = 0
    
    data = json.loads(connRequest(conn, 'GET', hostinfo.SITE + '/_search/scroll?scroll=10m', scrollId))
    
    f = open(indexFileName, 'w')
    
    while len(data["hits"]["hits"]) > 0:
        for item in data["hits"]["hits"]:
            f.write(item["fields"]["location"][0] + ',' + item["_id"] + '\n')
            f.flush()
    
        counter = counter + len(data["hits"]["hits"])
        print "Reading Index:", counter, "of", total
    
        scrollId = data["_scroll_id"]
        resp = connRequest(conn, 'GET', hostinfo.SITE + '/_search/scroll?scroll=10m', scrollId)
        #print resp
        data = json.loads(resp)
    
    f.close()
  12. To delete a file from the index, the Python snippet looks like this, where index is the id of the indexed file we want removed.
    def connDeleteFile(conn, index):
        print connRequest(conn, 'DELETE', hostinfo.INDEX + '/attachment/' + index)
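
The correlation described in step 11 can be sketched roughly as follows. This is only an illustration, written for Python 3 (the scripts above are Python 2), and the function names read_index_listing, list_disk_locations and plan_sync are hypothetical helpers, not part of the original scripts. It assumes the listing file holds one "location,id" pair per line, as the script in step 11 writes it, and it looks at a single flat directory.

```python
import os


def read_index_listing(lines):
    """Parse 'location,id' lines as written by the listing script."""
    indexed = {}
    for line in lines:
        line = line.rstrip('\n')
        if not line:
            continue
        # Split on the last comma so a comma inside a location survives.
        location, doc_id = line.rsplit(',', 1)
        indexed[location] = doc_id
    return indexed


def list_disk_locations(root_fs_dir, http_prefix):
    """Return the location each file in root_fs_dir would be indexed under."""
    locations = set()
    for name in os.listdir(root_fs_dir):
        if os.path.isfile(os.path.join(root_fs_dir, name)):
            locations.add(http_prefix + name)
    return locations


def plan_sync(indexed, on_disk):
    """Work out which locations to add and which index ids to delete."""
    to_add = sorted(on_disk - set(indexed))
    to_delete = sorted(indexed[loc] for loc in set(indexed) - on_disk)
    return to_add, to_delete
```

The ids returned in to_delete are what connDeleteFile expects, and the locations in to_add are the files still to be posted.  Note that this sketch only compares names, so a file whose contents changed would not be re-indexed.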

So there we have it.  All we have to do is figure out where we are getting our data from and copy it to the “data” directory.  One particular way I have done this is with rsync across an SMB share.

This by no means is meant to be a lesson on elasticsearch.  There can be some improvement here.
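
Two of the possible improvements are the recursion left as a to-do in the indexing script and the hand-built JSON string, which breaks if a file name contains a double quote.  Here is a minimal sketch of both, written for Python 3.  The names iter_files and build_attachment_body are hypothetical helpers of mine, not part of the original script.

```python
import base64
import json
import os


def iter_files(root_fs_dir):
    """Yield every file below root_fs_dir, descending into subdirectories."""
    for dirpath, dirnames, filenames in os.walk(root_fs_dir):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def build_attachment_body(filename, root_fs_dir, http_prefix):
    """Build the JSON body for one file, letting json.dumps do the escaping."""
    with open(filename, 'rb') as f:
        base64_data = base64.b64encode(f.read()).decode('ascii')
    # Path relative to the root, with forward slashes for use in a URL.
    relative = os.path.relpath(filename, root_fs_dir).replace(os.sep, '/')
    return json.dumps({
        'file': base64_data,
        'title': os.path.basename(filename),
        'location': http_prefix + relative,
    })
```

Each body produced this way could be posted to INDEX + '/attachment/' exactly as connAddFile does with its own string.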

However, this is a quick way to set up searching documents for information you never knew existed.  (Side note:  I have had 10 ms search times across 2500 documents.)

 

Recycling a Third Party Application with System Tray Icon

I had a need to recycle a third party application that had a system tray icon.  The application controlled hardware  and would get into a funky state.

The application was titled the “user mode driver”, but I’m not totally sure whether it used the User-Mode Driver Framework that Microsoft touted with Vista.  The user mode driver (UMD) was really a bridge process between the Ethernet port and a COM (a.k.a. the old-timer Component Object Model) in-process DLL that resided in your program’s memory space.

The UMD also had a system tray component that needed a little cleanup when the application was killed.  The system tray icon was left behind.

This post recycles others’ work, which we will reference.  It is about bringing it all together in C#.

There are three parts to this operation.

  1. Stop the process
  2. Restart the process
  3. Clean up the system tray.

For this example though, we will assume that we know the full path to the process and that the process name is the base file name without extension.

Stop the Process

C# has a handy way to stop processes.

private void StopUserModeDriver(string userModeDriverPath)
{
  Process[] procs = null;

  try
  {
    procs = Process.GetProcessesByName(Path.GetFileNameWithoutExtension(userModeDriverPath));

    foreach (Process proc in procs)
    {
      proc.Kill();
      proc.WaitForExit(5000);
    }
  }
  finally
  {
    if (procs != null)
      foreach (Process proc in procs)
        proc.Dispose();
  }
}

Restart the Process

This one is simple.

private void StartUserModeDriver(string userModeDriverPath)
{
  Process.Start(userModeDriverPath);
}

Clean Up the System Tray

This code is presented here and we will show it again in this post.

[StructLayout(LayoutKind.Sequential)]
public struct RECT
{
  public int left;
  public int top;
  public int right;
  public int bottom;
}
[DllImport("user32.dll")]
public static extern IntPtr FindWindow(string lpClassName, string lpWindowName);
[DllImport("user32.dll")]
public static extern IntPtr FindWindowEx(IntPtr hwndParent, IntPtr hwndChildAfter, string lpszClass, string lpszWindow);
[DllImport("user32.dll")]
public static extern bool GetClientRect(IntPtr hWnd, out RECT lpRect);
[DllImport("user32.dll")]
public static extern IntPtr SendMessage(IntPtr hWnd, uint msg, int wParam, int lParam);

private void RemoveOrphanedIconsFromSystemTray()
{
  IntPtr systemTrayContainerHandle = FindWindow("Shell_TrayWnd", null);
  IntPtr systemTrayHandle = FindWindowEx(systemTrayContainerHandle, IntPtr.Zero, "TrayNotifyWnd", null);
  IntPtr sysPagerHandle = FindWindowEx(systemTrayHandle, IntPtr.Zero, "SysPager", null);
  IntPtr notificationAreaHandle = FindWindowEx(sysPagerHandle, IntPtr.Zero, "ToolbarWindow32", "Notification Area");
  if (notificationAreaHandle == IntPtr.Zero)
  {
    notificationAreaHandle = FindWindowEx(sysPagerHandle, IntPtr.Zero, "ToolbarWindow32", "User Promoted Notification Area");
    IntPtr notifyIconOverflowWindowHandle = FindWindow("NotifyIconOverflowWindow", null);
    IntPtr overflowNotificationAreaHandle = FindWindowEx(notifyIconOverflowWindowHandle, IntPtr.Zero, "ToolbarWindow32", "Overflow Notification Area");
    RefreshSystemTrayArea(overflowNotificationAreaHandle);
  }
  RefreshSystemTrayArea(notificationAreaHandle);
}

private static void RefreshSystemTrayArea(IntPtr windowHandle)
{
  const uint wmMousemove = 0x0200;
  RECT rect;
  GetClientRect(windowHandle, out rect);
  for (var x = 0; x < rect.right; x += 5)
    for (var y = 0; y < rect.bottom; y += 5)
      SendMessage(windowHandle, wmMousemove, 0, (y << 16) + x);
}

Essentially, we are getting window handles to the notification area on your system tray and also to the overflow area introduced in Windows 7 (don’t know about Vista – does anyone remember Vista?).  That is the little arrow icon in the system tray that opens a little popup where all the pestering but insignificant applications’ system tray icons live.

Do you remember how, when you have an orphaned system tray icon, you move your mouse over it and it magically disappears?  That is exactly what this code does.  With the window handles to the system tray and the overflow area, we simply move the mouse repeatedly up and down and left to right.  We don’t actually move the cursor; we just send the Windows message.
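
To make the sweep concrete, here is a small Python sketch of the arithmetic only.  It sends no messages.  It just reproduces the grid of points and the packed coordinate value that RefreshSystemTrayArea passes as the last argument of SendMessage.  The function names and the sample sizes are hypothetical.

```python
def make_lparam(x, y):
    """Pack client coordinates as the C# code does: y in the high word, x in the low word."""
    return (y << 16) + x


def unpack_lparam(lparam):
    """Recover (x, y) from a packed value, for checking the packing."""
    return lparam & 0xFFFF, (lparam >> 16) & 0xFFFF


def sweep_points(width, height, step=5):
    """Yield (x, y) points in the same order as the nested loops in RefreshSystemTrayArea."""
    for x in range(0, width, step):
        for y in range(0, height, step):
            yield x, y


if __name__ == '__main__':
    # A hypothetical 12 x 6 pixel client area.
    for x, y in sweep_points(12, 6):
        print(x, y, hex(make_lparam(x, y)))
```

With a step of 5 pixels, any icon wider and taller than 5 pixels receives at least one simulated mouse move.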

There was another solution presented somewhere (on CodeProject, but I can’t find it now) that read information from the private bytes of the window allocations to determine whether a process was still running.  That approach was more pristine but did some memory allocation tricks in C# that made me nervous.  Sending mouse messages is certainly safer, although not elegant.

C++ Speed Test with FPU and ints

I wanted to test the difference on modern hardware between floating point math and integer math. Here is my code (which is similar to C# code I had written previously).

// CSpeedTest.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <windows.h>
#include <iostream>
#include <iomanip>

using namespace std;

#define FP_MULT
//#define FP_NEG
//#define INT_MULT
//#define INT_NEG

class StopWatch
{
    LARGE_INTEGER m_freq;
    LARGE_INTEGER m_startTime;
    LONGLONG m_totalTime;
public: 
    StopWatch() : m_totalTime(0L)
    {
        QueryPerformanceFrequency(&m_freq);
    }

    void Start()
    {
        QueryPerformanceCounter(&m_startTime);
    }

    void Stop()
    {
        LARGE_INTEGER stopTime;
        QueryPerformanceCounter(&stopTime);
        m_totalTime += (stopTime.QuadPart - m_startTime.QuadPart);
    }

    void Reset()
    {
        m_totalTime = 0L;
    }

    double ElapsedTime()
    {
        return (double)(m_totalTime) / (double)(m_freq.QuadPart);
    }
};

int _tmain(int argc, _TCHAR* argv[])
{
    #if defined(FP_MULT) || defined(FP_NEG)
    volatile double poo = 0.0;
    #endif
    #if defined(INT_MULT) || defined(INT_NEG)
    volatile int poo = 0;
    #endif

    StopWatch stopWatch;
    for (int idx = 0; idx < 1000000000; idx++)
    {
      stopWatch.Start();
      #if defined(FP_MULT)
        poo = -1.0 * poo;
      #endif
      #if defined(FP_NEG) || defined(INT_NEG)
        poo = -poo;
      #endif
      #if defined(INT_MULT)
        poo = -1 * poo;
      #endif
      stopWatch.Stop();
    }

    double elapsedTime = stopWatch.ElapsedTime();

    int minutes = elapsedTime / 60;
    int seconds = (int) (elapsedTime) % 60;
    int ms10 = (elapsedTime - int(elapsedTime)) * 100;

    // setw applies only to the next item written, so repeat it for each field
    cout << setfill('0') << setw(2) << minutes << ':' << setw(2) << seconds << ':' << setw(2) << ms10 << endl;

    return 0;
}

The code was compiled as a console application for Win32 Debug so the variables would get “registered”.

The test machine is a Dell Precision M4800. The processor is an Intel Core i7-4800MQ CPU at 2.70 GHz with 16 GB of RAM. The OS is Windows 7 Professional 64 bit with SP1.

Here are the results. I have also included the assembler for the operation under test.

define   | time  | assembly
FP_MULT  | 7.32s | fld qword ptr [__real@bff0000000000000 (0BE7938h)]; fmul qword ptr [poo]; fstp qword ptr [poo]
FP_NEG   | 7.56s | fld qword ptr [poo]; fchs; fstp qword ptr [poo]
INT_MULT | 7.58s | mov eax,dword ptr [poo]; imul eax,eax,0FFFFFFFFh; mov dword ptr [poo],eax
INT_NEG  | 7.59s | mov eax,dword ptr [poo]; neg eax; mov dword ptr [poo],eax

I don’t believe I have accomplished much, as the setup to call the timing functions takes many, many more opcodes than the operation under test.  However, this was an interesting experiment, and I now have a cool C++ stopwatch on Windows for more extensive testing on much larger blocks of test code.
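
One way to check how much of the measured time is the harness itself is to run the same start/stop loop around an empty operation and compare.  Below is a minimal sketch of that idea in Python rather than C++.  The function names are hypothetical and none of this is part of the original test.  Python adds function-call overhead of its own, so only the comparison between the two numbers means anything.

```python
import time


def timed_per_call(operation, iterations):
    """Sum the time measured around each single call, as the C++ loop does."""
    total = 0.0
    for _ in range(iterations):
        start = time.perf_counter()
        operation()
        total += time.perf_counter() - start
    return total


def noop():
    """Empty operation: anything measured for it is harness overhead."""
    pass


def multiply():
    """Stand-in for the operation under test."""
    value = 3.0
    return -1.0 * value


def compare(operation, iterations=100000):
    """Return (baseline, measured) so the two totals can be compared."""
    baseline = timed_per_call(noop, iterations)
    measured = timed_per_call(operation, iterations)
    return baseline, measured


if __name__ == '__main__':
    baseline, measured = compare(multiply)
    print("harness only:", baseline, "with operation:", measured)
```

If the two totals come out close to each other, the timing calls dominate.  That would be consistent with all four C++ results landing within a few percent of one another.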

Intelligent Agent #2
(ninja turtles)

As promised, here is a super simple example of an Intelligent Agent program using NetLogo.

The goal is to find the boundaries within an image. There are powerful algorithms for locating boundaries, but I wanted to do it another way: I wanted to see if Intelligent Agents could be used. As I said in my earlier post, NetLogo provides a rapid prototyping means for creating Agents and offers a visual development and execution environment. Below is the image I started with. Note the low resolution. NetLogo could handle a much higher resolution image, but the image would be too large to fit in this post, so I had to lower the resolution to display it here. (Someone who is an expert in NetLogo might point me to a setting that makes the ‘patches’ smaller; that would help.)

Screen shot before Agent run.
Contour1

Screen shot after Agents have run a bit.
Contour2

As you can see, the Agents did a fair job of locating the color contours, and thus the edges.
My approach is boneheadedly simple. Agents wander around randomly, keeping track of the color of recently visited patches. If the new patch is sufficiently different from the prior patch, the Agent marks the new patch white to indicate a boundary.
The approach has a serious flaw, and a weakness. The flaw is that the boundary is dependent on the direction of the Agent at the time it encounters a boundary. For example, an Agent moving from Blue to Red will mark the Red patch, while a neighboring Agent moving from Red to Blue will mark the Blue patch. This causes the edge to be much more ragged than it really is. This could be solved by adding a rule such as always mark the lighter colored patch.
The weakness is the algorithm could be much more efficient. Rather than wander aimlessly, the Agent could try to determine the direction of the edge and follow it. If two Agents collide, one could jump to a new location to let the other complete the boundary.

I am sure you are eager to see some code. OK, here goes:
extensions [bitmap]

turtles-own [last-color second-last-color hunt-color]

to setup
  let img bitmap:import "C:\\Temp\\Desert.jpg"
  ;;set img bitmap:to-grayscale img
  bitmap:copy-to-pcolors img false

  create-turtles 10 [fd 10]
  ask turtles [
    set last-color pcolor
    set second-last-color pcolor
    set hunt-color pcolor
  ]
end

to hunt
  ask turtles [
    rt random 50
    lt random 50
    fd 1
    if pcolor = 0
      [jump-elsewhere]
    set second-last-color last-color
    set last-color pcolor
    if isContour second-last-color last-color
      [set pcolor [255 255 255]]
  ]
  hunt
end

to-report isContour [color1 color2]
  let retVal false
  foreach [0 1 2]
  [
    ;; show item ?1 color1
    if abs (item ?1 color1 - item ?1 color2) > 60 ;; or item ?1 color2 - item ?1 color1 > 60
      [set retVal true]
  ]
  if approximate-rgb item 0 color1 item 1 color1 item 2 color1 = white or approximate-rgb item 0 color2 item 1 color2 item 2 color2 = white
    [set retVal false]
  report retVal
end

to jump-elsewhere
  set xcor random 40
  set ycor random 40
  set last-color pcolor
  set second-last-color pcolor
  set hunt-color pcolor
end