A Community is Defined By Its Center and Not Its Periphery

“My sense is that you need to build up a nucleus of people who know each other and who can network and support each other [in developing a proficiency in a technology with the complexity of R.]”

—Hadley Wickham in an interview with Dan Kopf, “What’s next for the popular programming language R?” Quartz. August 17, 2019.

It made me think of a Larry Wall Slashdot interview from back in the day, question 7 in particular.

Build Your Own Text Editor

“This is an instruction booklet that shows you how to build a text editor in C.

The text editor is antirez’s kilo, with some changes. It’s about 1000 lines of C in a single file with no dependencies, and it implements all the basic features you expect in a minimal editor, as well as syntax highlighting and a search feature.

This booklet walks you through building the editor in 184 steps. Each step, you’ll add, change, or remove a few lines of code. Most steps, you’ll be able to observe the changes you made by compiling and running the program immediately afterwards.”

https://viewsourcecode.org/snaptoken/kilo/index.html

It might be interesting to compare this to the source code of mg, a minimal editor written in C that works like Emacs.

Steam Game: TIS-100

TIS-100 is the Guitar Hero of assembly language programming: a restrictive instruction set, limited storage, and other constraints make it very challenging. My favorite review:

“A great game to play for when I want to feel stupid.”

If you are a programmer, a nerd, or someone who enjoys logic puzzles without much hand-holding, this game is for you. Recommended.

Learn to Program With Common Lisp

Tim Ferriss has a currently popular blog post, “Ten Lessons I Learned While Teaching Myself to Code,” that I’ve seen mentioned in a few places. While it is largely good advice, there is one point that is wrong. It does matter what language you learn. Here are the ten lessons from his article:

  1. The online world is your friend, start there.
  2. Don’t stress over what language to pick.
  3. Code every day.
  4. Automate your life.
  5. Prepare for constant, grinding frustration.
  6. Build things. Build lots of things.
  7. “View Source”: Take other people’s code, pick it apart, and reuse it.
  8. Build things for you—code you need and want.
  9. Learn how to learn.
  10. Reach out to other coders.

The programming language you choose matters. If what you are doing is trivial, then yes, you can use any programming language. To quote from Paul Graham’s essay, “Revenge of the Nerds”:

“The disadvantage of believing that all programming languages are equivalent is that it’s not true. But the advantage is that it makes your life a lot simpler. And I think that’s the main reason the idea is so widespread. It is a comfortable idea…There are, of course, projects where the choice of programming language doesn’t matter much. As a rule, the more demanding the application, the more leverage you get from using a powerful language. But plenty of projects are not demanding at all. Most programming probably consists of writing little glue programs, and for little glue programs you can use any language that you’re already familiar with and that has good libraries for whatever you need to do. If you just need to feed data from one Windows app to another, sure, use Visual Basic.”

Tim Ferriss is writing trivial programs. So, for his use case, the choice of language is irrelevant. It might be for your use case as well.

But, computer languages are not the same. They have different strengths and weaknesses. For example, in this blog post, a professional programmer discusses why Rust is not a good replacement for C. The short version: Rust is a young language that isn’t stable, and it lacks features older languages have, such as a specification.

Some languages are simply more powerful, more mature, and give you more options. If you are going to go through the trouble of learning how to program, why not ground yourself in a language with more capabilities?

Your first choice of programming language is going to shape how you think about programming. It can take a long time to broaden your sense of the possible if you pick a language with limited features when you first start learning.

There are many good, powerful programming languages. Arguments can be made for any language you like. However, Paul Graham’s article mentioned above makes a good case that Lisp is a very powerful language. In Pascal Costanza’s Highly Opinionated Guide to Lisp, there’s an interesting observation:

“…Lisp is, in some sense, the mother of all [computer programming] languages…the mindset of Lisp asserts that expressive power is the single most important property of a programming language. Nothing should get in your way when you want to use that power. A programming language should not impose its view of the world on the programmer. It is the programmer who should be able to adapt the language to his/her needs, and not the other way around.”

Lisp is old. It’s stable. It is powerful. Textbooks teaching it date back to the early 1990s, but they are still relevant and can be bought for almost nothing. So, why not learn Common Lisp? It’s a good question. The most common answer is that computer programming languages are subject to fads and newer languages are more popular. But, given Lisp’s flexibility, it’s difficult to make the case that the newer languages are better.

OK, suppose for a moment I’ve convinced you. Now, the question is: how do you go about learning Common Lisp? Last year, Steve Losh provided an answer. He put together a blog post called “A Road to Common Lisp” that explains the Lisp language in detail and how he went about learning it. In short:

  1. Install SBCL, or, if you’re using macOS and want a single GUI app you can download from the App Store, choose Clozure CL. My preference is for Emacs, SBCL, Quicklisp, and SLIME (tutorial), but you should use an editor you are comfortable with that can balance parentheses, highlight comments and strings, and auto-indent Lisp code. Atom can be a good choice if you haven’t used a text editor before, but it would be better if you learned either Emacs or Vim.
  2. Read Common Lisp: A Gentle Introduction to Symbolic Computation (<$2 on Alibris, online for free). Do all the exercises and grok in fullness. This book is aimed at intelligent beginners with no prior programming knowledge. Take your time and noodle with this text. It’s going to take time to develop a fluency, and it’s a different style of thinking.
  3. Then read Practical Common Lisp (<$25 on Alibris, online for free). Work through some of this book every day until you complete it. Type in the code if it is going to help you understand it. It doesn’t have exercises.
  4. Write some code for something easy, e.g., Project Euler.
  5. Then, flesh out your understanding further with Paradigms of Artificial Intelligence Programming (<$25 on Alibris).
  6. Now, you’re ready to program something serious and need a weighty reference text, i.e., Common Lisp Recipes (<$50 on Alibris).
  7. Finally, I suspect that Patterns of Software: Tales from the Software Community (<$10 on Alibris) is recommended because it helps with understanding how to work on software projects, but the purpose of the recommendation is kept unclear to prevent spoilers.

This strikes me as a better alternative than an $11,000 boot camp. Depending on how much you want to learn, you can get the tools you need to learn Common Lisp to the level you want for somewhere between $0 and $150. All the software is available at no cost, and it will run on any computer you already own. Then, if you want to branch out and learn Python, JavaScript, or some other language, you’ll have an excellent foundation. However, if you learn Python or JavaScript first and then try to learn Lisp or another more powerful language, you’ll find it a much more difficult task.

Good luck!

Two Computing Revolutions, Exhibit R: Manipulating & Visualizing MASIE Sea Ice Data

Back in April 2018, I mentioned the idea of two computing revolutions:

“There are two computer revolutions. One revolution is trying to abstract out the technology and present people with an easy, touch interface to accomplish specific tasks. Using your phone to take a picture, send a text message, post to social media, play YouTube videos, etc. are all examples of this type of technology. It’s probably the dominant form of computing now.

The other revolution is the complex computing tools being developed that cannot be used via a touch interface.”

The programming language R is an example of the second type of revolution. One simple task it can perform is reformatting data. It can take a long column of numbers from a source such as the MASIE sea ice data, which is essentially unintelligible to people, and change it into a form that makes it easy to compare a single sea across years, day by day.

[Image: the original data]
[Image: the reformatted data, facilitating comparison of a single day across years]
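
For the curious, here is the core of that reshaping step, condensed from the full script at the end of this post and using the Greenland Sea as the example. It is only a sketch: the column renamed by position assumes the same column order the full script uses.

library(tidyr)   # provides extract()

# Pull the current MASIE data and give the two columns we need usable names
masie <- read.csv("ftp://sidads.colorado.edu/DATASETS/NOAA/G02186/masie_4km_allyears_extent_sqkm.csv", skip=1, header=TRUE)
names(masie)[c(1, 9)] <- c("yyyyddd", "Greenland_Sea")

# Split the combined yyyyddd field into Year and Day
masie <- extract(masie, yyyyddd, into=c("Year", "Day"), "(.{4})(.{3})", remove=FALSE)

# Keep one sea, then spread it so each year becomes its own column
sea_area <- masie[, c("Year", "Day", "Greenland_Sea")]
names(sea_area) <- c("Year", "Day", "Sea_Ice")
sea_table <- reshape(sea_area, direction="wide", idvar="Day", timevar="Year")
names(sea_table) <- gsub("Sea_Ice.", "", names(sea_table))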

While this data manipulation is a powerful tool, it is only the tip of the iceberg. The real power comes from the kind of computing and visual representation that comes after the data has been reorganized. For example, once we manipulate the data into the above format, we can then create a correlation matrix that shows which years are most closely correlated with one another.

And this chart is generated by five lines of code, two of which create a pdf:

sea_rcorr <- rcorr(as.matrix(sea_table[, -c(1)]))
sea_coeff <- sea_rcorr$r
pdf(paste0("./output/sea-ice-in-", sea, "-correlation-", sea_date, ".pdf"))
corrplot(sea_coeff, method="pie", type="lower")
dev.off()

We can then build on this step, take the five years most closely correlated with the current year, and create a chart covering a specific period of days, like so:

Looking at this chart, we can make a pretty accurate guess at what the extent of sea ice will be on Day 100, because we can see that 2014, 2015, and 2016 are the closest years for comparison, and we can easily see by how much they differ.
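
The year-selection step behind this chart is also short. This is condensed from the full script below and relies on the sea_coeff matrix and sea_area data frame created there; the last column of the correlation matrix belongs to the current year, so sorting that column ranks the other years by how closely they track it. The Day 1–100 window here is just an example.

# Rank years by their correlation with the current year (the last column)
sea_matches <- sort(sea_coeff[, ncol(sea_coeff)], decreasing=TRUE)
closest_years <- names(sea_matches)[1:6]   # the current year plus its five closest matches

# Keep only those years and the window of days of interest, then plot
sea_graph <- sea_area
sea_graph$Day <- as.numeric(sea_graph$Day)
sea_graph$Sea_Ice <- as.numeric(sea_graph$Sea_Ice)
sea_graph <- subset(sea_graph, Year %in% closest_years & Day >= 1 & Day <= 100)
sea_graph$Year <- factor(sea_graph$Year)

library(ggplot2)
ggplot(sea_graph, aes(x=Day, y=Sea_Ice, color=Year)) + geom_line()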

R is a really powerful tool, particularly if you have to do repeated calculations on data that updates frequently and you need to present it in a format that helps decision making. It is far superior to anything I’ve ever done using spreadsheets. But it does take time to set up initially, and it is difficult for individuals to develop the expertise to do it effectively if they are not part of a team, like the one described in the article I posted earlier today: How the BBC Visual and Data Journalism Team Works With Graphics in R. Still, for certain kinds of problems it is a much better tool, and it is worth looking into if you find yourself analyzing complex data to make decisions.

Try It Yourself

Download RStudio for the operating system you use. Take the R script I used to manipulate the data and generate the charts above, copied below. You should be able to cut and paste the script into the Source pane in RStudio. You’ll also need to install the relevant libraries in the Packages pane on the lower right side. Then, since the script is written as a function, you’ll need to call the function in the Console pane – the box in the lower left corner – once you have loaded it, with something like:

> sea_ice(sea="Greenland_Sea", day_one=1, last_day=365)

Just the relatively easy exercise of getting this to work could serve as a starting place for getting a sense of how R works and how you might incorporate it into your workflow. It’s worth giving it a try.
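
If any of the library() calls at the top of the script fail because a package is missing, you can install everything the script loads in one go from the Console; this list simply mirrors its library() calls:

install.packages(c("data.table", "lubridate", "stringr", "ggplot2",
                   "dplyr", "tidyr", "reshape2", "corrplot",
                   "hydroGOF", "Hmisc", "forecast", "tseries"))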

Note 1: You will need to replace “Greenland_Sea” with the area of interest to you, i.e., one of “Northern_Hemisphere”, “Beaufort_Sea”, “Chukchi_Sea”, “East_Siberian_Sea”, “Laptev_Sea”, “Kara_Sea”, “Barents_Sea”, “Baffin_Bay_Gulf_St._Lawrence”, “Canadian_Archipelago”, “Hudson_Bay”, “Central_Arctic”, “Bering_Sea”, “Baltic_Sea”, “Sea_of_Okhotsk”, “Yellow_Sea”, or “Cook_Inlet”.

Note 2: This was my first serious attempt to write anything useful in R. I have some minor experience writing in Perl, Python, and a few other programming languages that helped make this easier to do. Still, it is worth noting that I’m not a programmer. Writing programs in R is a skill many people can learn, and it can be useful to some degree to almost anyone.

# sea-ice.R 
# Original: November 2, 2018
# Last revised: December 3, 2018

#################################################
# Description: Pulls MASIE ice data, limits by sea, 
# and runs a correlation matrix to determine which 
# five years have trend lines that are closest to
# current, and presents recent data in a graph.  
# Best use is in generating trend lines and being 
# able to visually compare current year with 
# previous years.
#
# Note: The prediction algorithm, ARIMA, is 
# commented out because it is of questionable 
# utility and makes this script run much slower.
# I left it in as a bad example for people 
# that want to try machine prediction.

rm(list=ls()) # Clear memory
gc()

# Set Variables in Function
# "Northern_Hemisphere", "Beaufort_Sea", "Chukchi_Sea",
# "East_Siberian_Sea", "Laptev_Sea", "Kara_Sea",
# "Barents_Sea", "Greenland_Sea", 
# "Baffin_Bay_Gulf_St._Lawrence", "Canadian_Archipelago", 
# "Hudson_Bay", "Central_Arctic", "Bering_Sea",
# "Baltic_Sea", "Sea_of_Okhotsk", "Yellow_Sea", 
# "Cook_Inlet"

# Replace the defaults in the function with the values you want,
# or call the function from the console
sea_ice <- function(sea="Greenland_Sea", day_one=300, last_day=365) {
  days_start <- day_one
  days_end <- last_day
  sea_of_interest <- sea
  sea_date <- Sys.Date()

  #################################################
  # Preliminaries

  # Define basepath and set working directory:
  basepath = "~/Documents/programs/R/forecasting"
  setwd(basepath)

  # Preventing scientific notation in graphs
  options(scipen=999)

  #################################################
  # Load libraries. If library X is not installed
  # you can install it with this command at the R prompt:
  # install.packages('X') 
  library(data.table)
  library(lubridate)
  library(stringr)
  library(ggplot2)
  library(dplyr)
  library(tidyr)
  library(reshape2)
  library(corrplot)
  library(hydroGOF)
  library(Hmisc)
  library(forecast)
  library(tseries)

  #################################################
  # Import, organize and output csv data

  # Testing
  # masie <- read.csv("./data/masie_4km_allyears_extent_sqkm.csv", skip=1, header=TRUE)

  # Import csv using current masie data
  masie <- read.csv("ftp://sidads.colorado.edu/DATASETS/NOAA/G02186/masie_4km_allyears_extent_sqkm.csv", skip=1, header=TRUE)

  # Assign column names, gets rid of (X) in column names
  colnames(masie) <- c("yyyyddd", "Northern_Hemisphere", "Beaufort_Sea", "Chukchi_Sea", 
              "East_Siberian_Sea", "Laptev_Sea", "Kara_Sea", "Barents_Sea", "Greenland_Sea",
              "Baffin_Bay_Gulf_St._Lawrence", "Canadian_Archipelago", "Hudson_Bay", 
              "Central_Arctic", "Bering_Sea", "Baltic_Sea", "Sea_of_Okhotsk", "Yellow_Sea", 
              "Cook_Inlet") 

  # Separating day and year
  masie_tmp <- extract(masie, yyyyddd, into = c("Year", "Day"), "(.{4})(.{3})", remove=FALSE)

  # Selecting the columns of interest
  sea_select <- c("Year", "Day", sea_of_interest)

  # Pulling column data from masie and putting it in the data frame
  sea_area <- masie_tmp[sea_select] 

  # Adding column names, changing sea name to sea ice
  colnames(sea_area) <- c("Year", "Day", "Sea_Ice")

  # Reshape the three columns into a table, fix names, and export csv file
  sea_table <- reshape(sea_area, direction="wide", idvar="Day", timevar="Year")
  names(sea_table) <- gsub("Sea_Ice.", "", names(sea_table))

  # Creating a csv file of the reshaped data
  write.csv(sea_table, file=paste0("./output/sea-ice-in-", sea, "-table-", sea_date, ".csv")) 

  #################################################
  # Correlation Matrix
  sea_rcorr <- rcorr(as.matrix(sea_table[, -c(1)]))
  sea_coeff <- sea_rcorr$r

  pdf(paste0("./output/sea-ice-in-", sea, "-correlation-", sea_date, ".pdf"))
  corrplot(sea_coeff, method="pie", type="lower")
  dev.off()
  
  # Takes current year, sorts the matches, take the year column names,
  # and then assigns them values for the graph
  sea_matches <- sea_coeff[,ncol(sea_coeff)]
  sea_matches <- sort(sea_matches, decreasing = TRUE)
  sea_matches <- names(sea_matches)
  current_year <- as.numeric(sea_matches[1])
  matching_year1 <- as.numeric(sea_matches[2])
  matching_year2 <- as.numeric(sea_matches[3])
  matching_year3 <- as.numeric(sea_matches[4])
  matching_year4 <- as.numeric(sea_matches[5])
  matching_year5 <- as.numeric(sea_matches[6])

  #################################################
  # Prediction using ARIMA
  #
  # Note: Of questionable utility, but included as a curiosity.
  # Other possibilities for prediction: Quadratic Regression and 
  # Support Vector Regression (SVR). SVR examples are easy to find with Google. 
  # For an R script for quadratic prediction for sea ice, see:
  # http://ceadserv1.nku.edu/data/Rweb/RClimate/#Ocean%20Ice:

  # sea_data <- sea_area # Importing initial MASIE area data

  # if (front_year == "TRUE") {
  #  sea_data <- filter(sea_data, Day <= 200)
  # } else {
  #  sea_data <- filter(sea_data, Day >= 165)
  # }

  # sea_data <- sea_data[, -c(1,2)] # Remove Year & Day
  # sea_ts <- ts(sea_data, start = 1, frequency=200) # ARIMA needs time series
  # sea_data <- as.vector(sea_data)
  # sea_fit <- auto.arima(sea_ts, D=1) 
  # sea_pred <- forecast(sea_fit, h=200, robust=TRUE)
  # sea_plot <- as.data.frame(Predicted_Mean <- (sea_pred$mean))
  # sea_plot$Day <- as.numeric(1:nrow(sea_plot))

  # ggplot() + 
  #  ggtitle(paste("Predicted Sea Ice Extent")) +
  #  geom_line(data = sea_plot, aes(x = Day, y = Predicted_Mean)) + 
  #  scale_x_continuous()
  
  #################################################
  # Historic Trends Graph
  sea_graph <- sea_area

  # Filtering results for 5 year comparison, against 5 closest correlated years 
  sea_graph <- filter(sea_graph, Year == current_year | Year == matching_year1 |
                      Year == matching_year2 | Year == matching_year3 | 
                      Year == matching_year4 | Year == matching_year5)

  # These variables need to be numerical
  sea_graph$Day <- as.numeric(sea_graph$Day)
  sea_graph$Sea_Ice <- as.numeric(sea_graph$Sea_Ice)

  # Filtering results into period of interest
  sea_graph <- filter(sea_graph, Day <= days_end)  
  sea_graph <- filter(sea_graph, Day >= days_start)

  # The variable used to color and split the data should be a factor so lines are properly drawn
  sea_graph$Year <- factor(sea_graph$Year)

  # Lays out sea_graph by day with a colored line for each year
  sea_plot <- ggplot() + 
    ggtitle(paste("Incidence of Sea Ice in", sea_of_interest)) +
    geom_line(data = sea_graph, aes(x = Day, y = Sea_Ice, color = Year)) + 
    # Uncomment this if you have a sensible forecast algorithm.
    # geom_line(data = sea_plot, aes(x = Day, y = Predicted_Mean, color="Forecast")) + 
    scale_x_continuous()

  sea_plot
  ggsave(filename=paste0("./output/sea-ice-in-", sea_of_interest, 
                         "-plot-", sea_date, ".pdf"), plot=sea_plot)

  summary(sea_graph)
  sd(sea_graph$Sea_Ice)

  return(sea_graph[nrow(sea_graph), ncol(sea_graph)])
}

The New 8-bit Heroes

“While visiting his parents’ home in Central New York, Joe Granato discovered a box of forgotten illustrations, designed by him and other eight-year-old neighborhood friends—concepts for a video game for the original 8-bit Nintendo Entertainment System. He decided it might be fun to try to realize those ambitions.

But instead of creating it for a mobile device or modern console, he set out to use the same techniques and adhere to the same limitations that would have been employed in 1988 to make a new cartridge-based game actually playable on the now-archaic hardware.

Gathering a small team of modern creatives, what began as an explorative novelty project about building a video game for a system 30 years removed from relevance escalated to a two-year, ten-thousand-mile journey into an esoteric subculture made up of devotees to creating new NES games; artists who thrive on limitation.”

The New 8-bit Heroes

h/t Tedium.