Wednesday, December 07, 2016

Interoperability of AI assistants


I don't think it would be presumptuous of me to assume that most of you have used an AI assistant of some sort, be it Siri on iPhones and MacBooks, Google Now on Android, Cortana on Microsoft Windows, or Alexa on Amazon Echo. Even if you have only used the "speak to type" option on your phone, you have used one.

Most AI assistants have limited functionality and do certain things quite well. I don't care that Alexa isn't as smart as Google Assistant; it makes up for that with an extensive skill/rule set. Google Assistant shines in situations where the queries are more free-form. Siri, Cortana, Google Assistant, and Alexa all depend on the information you share with them, and this inherently limits the efficacy of each. Google Express is not going to replace Amazon for my shopping needs, Microsoft Live mail is not going to replace Gmail, and I'm not going to exchange my MacBook or Windows desktop for a Chromebook. So what is the solution?!

The browser wars of the 90s and early 2000s have shown us how this is going to go down. Couldn't the industry leaders pool their resources to come up with something like a W3C standard and provide a standard, browser-like shim layer (think Reactor/Proactor design patterns) that dispatches requests to the individual subsystems (the AI assistants)? In the first phase, this could be a simple rule engine that punts my email queries to Google Assistant, my shopping inquiries to Alexa, and my file-related queries to Siri/Cortana, etc.
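
To make the "simple rule engine" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the three handler functions are stand-ins (no such common API exists across these assistants), and the keyword rules are just the email/shopping/file split described above.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical stand-ins for the real assistants. None of these vendors
# expose a shared interface like this; that is the whole point of the post.
def google_assistant(query: str) -> str:
    return f"[Google Assistant] {query}"

def alexa(query: str) -> str:
    return f"[Alexa] {query}"

def siri(query: str) -> str:
    return f"[Siri] {query}"

# First matching rule wins. The keywords are made up for this sketch.
RULES: List[Tuple[re.Pattern, Callable[[str], str]]] = [
    (re.compile(r"\b(e-?mail|inbox)\b", re.I), google_assistant),
    (re.compile(r"\b(buy|order|shopping|cart)\b", re.I), alexa),
    (re.compile(r"\b(file|folder|document)s?\b", re.I), siri),
]

def dispatch(query: str, default: Callable[[str], str] = google_assistant) -> str:
    """Punt the query to the first assistant whose rule matches it."""
    for pattern, handler in RULES:
        if pattern.search(query):
            return handler(query)
    # Free-form queries fall through to a default assistant.
    return default(query)

if __name__ == "__main__":
    for q in ("Do I have any new email?",
              "Order more coffee filters",
              "Find the file I edited yesterday"):
        print(dispatch(q))
```

A real shim would need far more than keyword matching, but even this much shows that the dispatching itself is the easy part; agreeing on the interface behind each handler is the hard part.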

Taking the browser parlance a bit further, I would want to theme my AI assistants too. I don't want to call them Google/Cortana/Alexa. I want to call them Mr Chekov or Mr Sulu, and I want them to refer to me as Captain, always with a sense of impending doom (Star Trek joke for the uninitiated).

A unified AI assistant might just be a pipe dream. Hopefully, it doesn't take the AI assistants a decade to mature (like it did for the browsers).

Wednesday, May 18, 2016

Takeaways from RedisConf

I have been using Redis as a Swiss Army knife database, both at work and for hobby projects, so I decided to attend RedisConf this year. The low cost of entry, more tech talks than marketing spiels, and a chance to meet Salvatore Sanfilippo, the creator of Redis, were the big draws of the event.
The event was a great introduction to upcoming and lesser-used Redis features (lesser used by me, that is). It was a fun and eye-opening window into a lot of different domains and use cases. When all you sell is cheese, the concept of sour cream is interesting.
Some that blew my mind were:

  1. IoT: Prior to this conference, I had a limited understanding of its use cases. IoT, to me, was just using a web-controlled device: a light bulb, or a smart fridge that'd ping me if the milk ran out, etc. I.e. a very Silicon Valley centric first-world problem solver.

    The use cases that Laura Merling provided in her talk just blew me away:
    • Disney MagicBands: When you navigate your way through Disney parks, your hotel automatically knows that you are coming, your door automatically unlocks and, most importantly, if you get dangerously close to a moving ride, the ride adjusts itself (it can stop) to get you out of harm's way!
    • The oil & gas industry: The oil rigs generate a lot of real time information on the equipment’s performance and health — pressure, temperature, flow rates and multitudes of other variables. IoT based solutions provide real time analytics for predictive and preventative measures. This link explains this further.
    • Factory safety: Remember the incident when an automotive factory robot arm crushed a factory operator against a metal plate? This happened because the robot arm mistook the person for another part of the machinery. There have been around 26 such deaths in the past 30 years in the US alone. This number, although low, can be brought down further by judicious use of IoT tech (just like Disney's MagicBand).
    • Home automation: The home automation industry is fairly nascent right now. There is an app to control the lights, another one to control your locks, another one for your TV, and so on. There needs to be a push from consumers for tighter integration of products for arbitrary control, e.g. to turn down the volume of the music system if someone is sleeping in the other room, or to turn on the lights when the doors are unlocked, and so on.

  2. Distributed systems & the rise of DevOps: There were a couple of talks by companies like Netflix and Scopely. The amount of data served and generated by companies these days is monumental.

    Avram Lyon from Scopely gave a behind-the-scenes look at their data usage. He talked about a simple mobile game where the goal is to hit oncoming zombies with objects. The game has a modest 100,000 users active at any given time. To generate insights into the game (e.g. people in California like to throw rocks at zombies, whereas people in New York like to hurl trash cans at them, or no one seems to like throwing hammers at zombies), a huge amount of data is generated and recorded. This is used to provide the development and marketing teams with actionable intelligence. A few TBs of data is not huge by today's standards, but conventional databases are still ill-suited for this endeavor. Redis, along with its supporting actors, lends itself well to this use case.

    The conference was filled with 2-4 member teams who had set up massively scaled systems within 5-6 months using FOSS (Kafka, Spark, Redis, etc.) on an el-cheapo AWS cluster, with automatic horizontal scaling being the key takeaway. One doesn't need to be a star programmer to build these systems. The architecture has matured enough that systems can be deployed with (relatively) minimal effort.

    Another emergent database concept was eventual consistency. Unlike traditional database systems, which dictate that the view of the database needs to be the same for everyone, eventually consistent databases are like.. meh, it's not the end of the world if you are able to access the new season of "House of Cards" a few minutes after someone else. Heuristics are further used to extrapolate the behavior of the system to get massive scalability. Amazon, Netflix, etc. have these kinds of systems. Take a look at this to get a more detailed view.

  3. Hardware advancement: Traditionally, Redis has been used as a memcached replacement, something that sits in front of traditional SQL databases. Apps like Twitter et al. would store the hot keys in a massively distributed Redis cluster (on the order of TBs) and would later save these to some backend stores. This adds another layer of complexity. With the advent of flash memory, the backend stores now typically sit on flash. However, the newer NVDIMMs & NVMs perform orders of magnitude better than flash memory at just 1/10th the price of RAM. With the closing gap between RAM & flash, the functionality and scalability of Redis is increasing by leaps and bounds.

  4. Redis Modules: The usual databases are just dumb datastores. Their main utility is to store the data efficiently in memory and provide an easy way to retrieve it. All the intelligence to consume that data goes into the client code. Salvatore announced that starting with 4.0, Redis would support loadable modules that can be used to offload client functionality to Redis. You could use this to keep track of counter rates or a running average, or use simple ML algos to automatically classify the images being saved in the DB, or apply some sort of filter on them. The possibilities are endless.
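
As a rough illustration of the kind of logic that could move from the client into the datastore, here is a running average that updates incrementally on every write. This is plain Python, not Redis module code (modules are written against a C API), and the class name and method are made up for this sketch.

```python
class RunningAverage:
    """Keeps a mean up to date without storing the individual samples."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def add(self, value: float) -> float:
        # Incremental update: new_mean = old_mean + (value - old_mean) / n
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


if __name__ == "__main__":
    avg = RunningAverage()
    for sample in (10, 20, 30):
        avg.add(sample)
    print(avg.count, avg.mean)
```

Today every client would carry this logic and round-trip the state; with a module, the update could happen next to the data in a single command.
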
Small advancements in hardware typically bring about much bigger changes in software, which radically change the application development paradigm.

About a decade ago, a simple classifier on an MPI Beowulf cluster took a few weeks to implement. These days, I could use a Python ML classifier to do the same thing in just a few hours.

As complicated technology becomes more pervasive, the entry points to the tech itself get easier, which further paves the way for more complicated technology. Redis has already enabled a lot of industries, which are in turn revolutionizing everything from healthcare to robotics.
We live in interesting times.


Saturday, March 05, 2016

Programming and the art of shoe polishing

One of the most undocumented occupational hazards of being a software engineer is that the rest of the world is not privy to our thought process, nor do they have a deeper understanding of what we do. Yes, coding is a part of it, and a very big part of it, but it is the understanding of the system that separates good engineers from bad ones. Coding is easy; coding a system is hard. There is a whole lot of difference between learning English and writing a novel. Just as an exercise, let's imagine that the task at hand is polishing shoes. Our protagonist is the engineer (SE). The product manager (PM), the quality assurance (QA) engineer & the shoe polisher (SP) have the supporting roles in this play.

Bad Engineer's Life
PM: I want our company to polish shoes. We will provide supplies to our shoe polisher team who will polish the shoes.
SE: I'll get on it.

[CODE]
- Distribute shoes to each SP(shoe polisher).
- Distribute polish to each SP.
- Ask each SP to use the given polish on each shoe.
- Take shoes from them.


QA: Hey! Your team used brown polish on black shoes. I'm filing a bug.
SE: Oops. I didn't know that. Let me fix it.

[CODE]
- Distribute shoes to each SP(shoe polisher).
- Each shoe has a COLOR.
- Distribute polish to each SP. Each polish has a COLOR.
- Ask each SP to use the given polish(COLOR) on each shoe(COLOR).
- Take shoes from them.


QA: Hey! Your team put black leather polish on my black canvas shoes. I'm filing a bug.
SE: Oops. Hey, why would the customer even want to polish canvas shoes?! Please test my system properly.
QA: I don't care. You should not ruin your customer's shoes.
SE: Let me talk to PM.

PM: You should not ruin your customer's shoes.
SE: Ok.

[CODE]
- Distribute shoes to each SP(shoe polisher).
- If shoe is CANVAS then return to the customer.
- Each shoe has a COLOR.
- Distribute polish to each SP.
- Each polish has a COLOR.
- Ask each SP to use the given polish(COLOR) on each shoe(COLOR).
- Take shoes from them.


QA: Hey! SP still polishes my hip hemp shoes from SF. And why should the customer wait for shoe distribution to get his canvas shoe back? I'm filing a bug. Hey PM, the code is unstable!!
PM: Hmmm :(
SE: Let me fix it.

[CODE]
- Get shoes from Customer.
- Return shoe if NOT leather shoes.
- Distribute shoes to each SP(shoe polisher).
- Each shoe has a COLOR.
- Distribute polish to each SP.
- Each polish has a COLOR.
- Ask each SP to use the given polish(COLOR) on each shoe(COLOR).
- Take shoes from them.


QA: After 1 work day, the shoes don't look polished. Something is not working.
SE: I don't know, it works for me. Can you try it again and let me know?
-- After one day --
QA: See.. I told you so. I'm filing a bug.
SE: Oops.. the polish ran out after a day. Let me add more polish each day.

[CODE]
- Get shoes from Customer.
- Return shoe if NOT leather shoes.
- Distribute shoes to each SP(shoe polisher).
- Each shoe has a COLOR.
- Distribute polish to each SP.
- Each polish has a COLOR.
- Ask each SP to use the given polish(COLOR) on each shoe(COLOR).
- If SP is working more than a day, then distribute more supplies to each SP.
- Take shoes from them.


QA: The product passes the test cases.
PM: We can sell this code to SUPER-MEGA-STORE then. Let us do some scale testing. They order around 100 shoe polishes a day! This could be a big order for us. I feel like Steve Jobs.
QA: Sure, whatever, I'll test out the cases.

QA: PM, we polish 100 shoes in a week! We need to handle that many in a day!
PM: SE.. make it so!
SE: I'll try. We can handle around 15 shoes in a day, so we need to get 7 times as many SPs, maybe 6 times if we get stronger and faster SPs.
PM: Hmmm... We'll have to pass the cost to SUPER-MEGA-STORE.

SUPER-MEGA-STORE: We like the fonts on the signboard outside your establishment. Take our money and give us the product, please.
--- After Deployment ---

SUPER-MEGA-STORE: Hey, nobody wears black leather shoes in our Sunnyvale store. We have too many unused black polish cans in our stores right now.
SE: I'll fix it.
...
SUPER-MEGA-STORE: Hey, one of the SPs died while working. The shoes are now piling up!
SE: I'll fix it.

[CODE]
- Get shoes from Customer.
- Return shoe if NOT leather shoes.
- Distribute shoes to each SP(shoe polisher).
- Each shoe has a COLOR.
- Distribute polish to each SP.
- Each polish has a COLOR.
- Ask each SP to use the given polish(COLOR) on each shoe(COLOR).
- If SP is working more than a day.
- If polish(COLOR) is over, replace it.
- If SP is dead, get new SP.
- Take shoes from them.


SUPER-MEGA-STORE: Hey, we want canvas shoes to be washed too.
SE quits.

This happens over the period of a year. The SE's own unscalable spaghetti code has tormented him. The fluidity of requirements from the customer adds fuel to the fire. When one SE quits, another SE takes his place, mutating this code even more into an unrecognizable mess.

Now let's see what happens with a more experienced engineer:

Good Engineer's Life
PM: I want our company to polish shoes. We will provide supplies to our shoe polisher team who will polish the shoes.
SE: I'll get on it. What kind of shoes though?
PM: Leather shoes.
SE: Any specific set of colors? Google lists some common colors as BLACK, BROWN, NEUTRAL, WHITE, CHERRY RED & CORDOVAN.
PM: Hmmm..
SE: Also, what about specialty leather shoes, like those made with crocodile leather?
PM: Market research shows that people mostly wear BLACK and BROWN shoes, so let's stick with that.
SE: Any particular brand of polish you want me to use?
PM: Use the CODEYMAN brand. We can get bulk discount from them.
-- A week later --
SE: Hey, I found out that one box of polish can polish about 250 shoes. I also spoke to my SP friend. He can shine 10 shoes in a day. Do you know about how many shoes we'll be shining a day?
PM: You ask too many questions, I somewhat hate you. SUPER-MEGA-STORE might be interested, but that is just a rumor I heard. But you haven't yet started writing code.
SE: So how many shoe shines a day?
PM: About 100 maybe?
SE: So we need about 10 SPs working each day. We can have 20 SPs who work on two rotating shifts. This would reduce the downtime if something were to happen to them.
PM: Stop doing a PhD in this! I want the code.
SE: Also, 100 shoes a day is 700 shoes a week, which means about 3 polish cans would be used in a week. We need to keep a larger inventory of cans, to keep SPs from waiting on a polish can if someone else is using it. You can use this to set a price point for the product.
PM: I hate you! Give me an ETA.
SE: I'll sketch some informal design plans and get back to you in 2-3 days.
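
The SE's back-of-the-envelope numbers are easy to check in a few lines of Python. The inputs (100 shoes a day, 10 shoes per SP per day, 250 shoes per can) come straight from the conversation above; the function and key names are just made up for this sketch.

```python
import math


def plan_capacity(shoes_per_day: int, shoes_per_sp_per_day: int, shoes_per_can: int) -> dict:
    """Rough staffing and inventory estimate, rounding up at every step."""
    sps_per_day = math.ceil(shoes_per_day / shoes_per_sp_per_day)
    shoes_per_week = shoes_per_day * 7
    cans_per_week = math.ceil(shoes_per_week / shoes_per_can)
    return {
        "sps_per_day": sps_per_day,
        "shoes_per_week": shoes_per_week,
        "cans_per_week": cans_per_week,
    }


if __name__ == "__main__":
    # 100 shoes/day, 10 shoes per SP per day, 250 shoes per can of polish
    print(plan_capacity(100, 10, 250))
```

This gives 10 SPs a day, 700 shoes a week, and 3 cans a week (2.8 rounded up), which matches what the SE told the PM. Note that it leaves no slack at all: the extra shift and the extra inventory are the SE's own judgment on top of the arithmetic.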

.. Time goes on ..
SE: PM, I'm going to be code complete in some time. I want the QA to add test cases.
PM: So the product is done?
SE: No, I've just written the code and done some unit testing. System testing etc. will take at least a month.
PM: So the product is done? I'll tell the Sales team to sell the product.
QA: I'll start testing it as soon as I'm done testing some other thing the SE doesn't care about.

SE: Here's the code

[CODE]
- Don't allow customers without BLACK or BROWN leather shoes.
- Let there be X BLACK polish cans and Y BROWN polish cans.
- Every 12 hours:
    a. New set of 10 SPs come in.
       - If fewer than 10 SPs come in, pay the exiting employees overtime to pick up the slack.
    b. SPs distribute the shoes amongst themselves, trying to maximize the number of shoes of a particular COLOR.
    c. Each SP takes the cans of polish corresponding to the shoe colors he has. He decrements the number of shoe polish cans available by 1.
       - If the number of shoe polish cans hits the threshold T, he puts an order into the system to get more polish. This way they don't have to wait for polish if it runs out.
    d. If SP runs out of shoes to polish:
       - He'll put the polished shoes on the rack.
       - He'll repeat steps from b. again.
    e. When SPs are ready to leave:
       - If fewer than 10 SPs leave then:
         - Find if SP is dead and put a replacement SP for the next rotated shift.
         - If SP is hiding, then ask him to leave.
- Take shoes from them.
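
For the curious, here is one way a single shift from the plan above might look in actual code. It is a loose sketch, not the SE's real implementation: the names, the default threshold, and the per-SP capacity per shift are all assumptions, and dead or hiding SPs are modeled only as a smaller `sp_count`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

ALLOWED_COLORS = {"BLACK", "BROWN"}


def accept(color: str, material: str) -> bool:
    """Entry check: only BLACK or BROWN leather shoes get in."""
    return material == "LEATHER" and color in ALLOWED_COLORS


@dataclass
class Shop:
    cans: Dict[str, int]               # X BLACK cans and Y BROWN cans
    reorder_threshold: int = 2         # the threshold T (value is made up)
    reorders: Set[str] = field(default_factory=set)
    rack: List[str] = field(default_factory=list)

    def take_can(self, color: str) -> None:
        """Step c: take a can, and reorder before the stock runs dry."""
        if self.cans[color] <= 0:
            raise RuntimeError(f"out of {color} polish")
        self.cans[color] -= 1
        if self.cans[color] <= self.reorder_threshold:
            self.reorders.add(color)

    def run_shift(self, shoes: List[str], sp_count: int = 10,
                  shoes_per_sp: int = 10) -> List[str]:
        """Polish one shift's worth of shoes (given as colors); return the backlog."""
        # Step b: sort by color so each SP's batch is mostly one color.
        queue = sorted(shoes)
        capacity = sp_count * shoes_per_sp
        todo, backlog = queue[:capacity], queue[capacity:]
        for start in range(0, len(todo), shoes_per_sp):
            batch = todo[start:start + shoes_per_sp]
            for color in sorted(set(batch)):
                self.take_can(color)
            # Step d: polished shoes go on the rack.
            self.rack.extend(batch)
        return backlog


if __name__ == "__main__":
    shop = Shop(cans={"BLACK": 10, "BROWN": 10}, reorder_threshold=5)
    left_over = shop.run_shift(["BLACK"] * 60 + ["BROWN"] * 50)
    print(len(shop.rack), len(left_over), shop.cans, shop.reorders)
```

Calling `run_shift` with `sp_count=9` on a full day's load leaves a backlog, which is exactly the edge condition QA complains about next.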


PM: (During review): Please use spaces instead of tabs and use the function names according to coding standards.

QA: Hey, edge condition issues. If I'm getting 100 shoes/day and one SP dies, the backlog never goes away.
SE: Oh yeah. Let's add a fudge factor and use 11 SPs. I'll take care of it in the release.

PM: How is the quality?
QA: Seems to mostly work. Not many open issues I think.
PM: Great, let's deploy it.

SUPER-MEGA-STORE: Hey, your system doesn't handle bursts of more than 110 shoe-shines. We need a burst of 200 during Christmas holidays.
PM: Sure. We can include redundancy and burstiness of shoe-shine traffic in your tariff. This would translate to hiring seasonal workers.

SUPER-MEGA-STORE: I need to get canvas shoes washed too.
PM: That feature will be supported in the next release.

SE: Shouldn't be a big deal. I'll design a similar product for canvas shoes and encapsulate both products behind an API that is multiplexed by shoe type.
PM: Whatever.. just do it.

SE goes home.
SE: Honey.. I'm home.
Wife: Whatever! You spend all your time at work. I hate you.

.. SE is sad.