Links for the Week of 2021-08-29
Here are some links I covered in the past week.
Table of Contents
- General
- Why I Hate Calendars
- Algorithmic Trading
- Working With Audio Files and DeepSpeech in R and Python
- Developer Documentation as a Learning Tool
- AWS Tools: Amazon EMR
General
Setting up a new MacBook Pro
A complete setup of a MacBook Pro by an R data scientist. Check out the notes he made while waiting for Big Sur to download: Link - DB.
Setting up a Windows Machine - Scott Hanselman's 2021 Ultimate Developer and Power Users Tool List for Windows
R Stock Price Alert Tool
Set up Windows Task Scheduler to run this script every night and get an email with price alerts for multiple stocks.
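The core of a nightly alert script like this is a comparison of the latest prices against limits you choose. Here is a minimal sketch of that step in Python rather than R. It is not the linked tool's code: the function name, the tickers and the price bands are all made up for illustration, and fetching quotes and sending the email are left out.

```python
def find_alerts(prices, limits):
    """Return alert messages for tickers whose price is outside its (low, high) band."""
    alerts = []
    for ticker, (low, high) in limits.items():
        price = prices.get(ticker)
        if price is None:
            continue  # no quote available for this ticker tonight
        if price <= low:
            alerts.append(f"{ticker} at {price:.2f} is at or below {low:.2f}")
        elif price >= high:
            alerts.append(f"{ticker} at {price:.2f} is at or above {high:.2f}")
    return alerts


# Made-up example data; a real run would pull quotes from a data source.
prices = {"AAA": 9.50, "BBB": 52.00, "CCC": 101.25}
limits = {"AAA": (10.00, 15.00), "BBB": (40.00, 60.00), "CCC": (80.00, 100.00)}
alerts = find_alerts(prices, limits)
for line in alerts:
    print(line)
```

The scheduler only has to run the script once a night; the email body would be whatever `find_alerts` returns.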
rVest Webscraping Cheat Sheet
tags: cheatsheet
R Markdown Websites
See also the rmarkdown site generator section of the rmarkdown book: Link - DB.
2020 Wage Data for the State of Kansas
5 Metrics Every Dental Practice Should Be Tracking
58 Best Google Analytics Books of All Time
Why I Hate Calendars
I am terrible at keeping a calendar (just ask my wife). Calendars are obviously useful tools but something in my brain stops me from using them effectively.
I have issues with calendars:
- New Events Come At You Randomly and Are Hard to Process. You don’t know when you will need to schedule an event on your calendar or where you will be when it happens. For example, you are dropping your kid off at school, and he tells you he has to be at a basketball practice tomorrow at 7 p.m. before fleeing the car. You vaguely remember you have a fantasy football draft at the same time but you are not sure.
If you don’t have your beloved calendar in front of you, it’s a challenge to decide how to process this information: Are you available? Do you have a conflict? Is it likely a conflict will arise between now and then? If you have a conflict, which event should you prioritize (goodbye, fantasy football)? Is there a better alternative option? Who is available to cover this new appointment? Can’t your kid just play video games instead?
- There is No Metric. I get that we only have so many years/months/weeks/days/minutes/seconds in our lifetime, and that calendars exist to help organize, plan and prioritize those blocks of time. But calendars are basically just a data table with a clean shirt on. They log the data and then just sit there smiling aimlessly, all the while looking clean and organized and pretty. It’s lipstick on a pig. Calendars leave it to the user to decide what to do with all this data. You can manually schedule, re-schedule or delete future events but most of the time there is no way to tell whether these actions are effective in meeting your goals.
- Calendars Can’t Predict the Future (Or Measure the Past). Calendars don’t remember where you have been or how you got where you are, and they don’t forecast where you are going.
- Calendars Are Bad at Categorization. Unless you adopt your own system, there is no way to tell how each entry should be categorized in relation to anything else.
In light of all of these problems, I am seeking a better way to plan my time blocks. Below are some links that have helped me along the way.
A Mini Calendar in Your R Console
How to get a calendar for any range of month-years in the R console.
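The same idea works in a Python console. This is a rough sketch using the standard library `calendar` module, not the code from the linked post; `month_range_calendars` is a name I made up for the example.

```python
import calendar


def month_range_calendars(start_year, start_month, end_year, end_month):
    """Return a list of text calendars, one per month, from start to end inclusive."""
    pages = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        # calendar.month() returns one month as a multi-line string
        pages.append(calendar.month(year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return pages


# Print August through October 2021, one month after another.
pages = month_range_calendars(2021, 8, 2021, 10)
print("\n".join(pages))
```

Comparing `(year, month)` tuples lets the loop roll over a year boundary, so a range like December 2021 to January 2022 works too.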
Why You Should Use a Calendar
Algorithmic Trading
24x5 AI Stock Trading agent to predict stock prices - Live trading
In-depth review of creating an algo trading bot hosted on AWS. Mostly Python, but really good information.
Algorithmic Financial Trading with Deep Convolutional Neural Networks: Time Series to Image Conversion Approach
Python: I have tested a Trading Mathematical Technic in RealTime.
A Step-by-Step Implementation of a Trading Strategy in Python Using ARIMA Garch Models
Forecasting Stock Returns Using the ARIMA Model
Working With Audio Files and DeepSpeech in R and Python
Efficiently split a large audio file in R - Stack Overflow
DeepSpeech-examples/autosub at r0.9 · mozilla/DeepSpeech-examples
Excellent code that uses DeepSpeech to create subtitles for videos. This code helped immensely when trying to split audio files into chunks to be transcribed by DeepSpeech. One tip was using the pyAudioAnalysis library to split up audio files at “silent” breaks. That way no words are cut off.
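To show what splitting at “silent” breaks means, here is a small sketch in plain Python. It is not pyAudioAnalysis and not the autosub code: it swaps in a simple energy threshold over fixed-size frames, and the function name, parameters and sample data are all assumptions made up for the example.

```python
def split_on_silence(samples, frame_len, threshold, min_silent_frames):
    """Return (start, end) sample index pairs for the non-silent chunks.

    A chunk is closed only after min_silent_frames quiet frames in a row,
    so short pauses inside a word do not trigger a cut.
    """
    chunks = []
    chunk_start = None  # index where the current spoken chunk began
    silent_run = 0      # consecutive quiet frames seen so far
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / len(frame)  # mean squared amplitude
        if energy >= threshold:
            if chunk_start is None:
                chunk_start = start
            silent_run = 0
        elif chunk_start is not None:
            silent_run += 1
            if silent_run >= min_silent_frames:
                # Close the chunk at the point where the silence began.
                end = start - (silent_run - 1) * frame_len
                chunks.append((chunk_start, end))
                chunk_start = None
                silent_run = 0
    if chunk_start is not None:
        # Audio ended mid-chunk; keep everything up to the last sample.
        chunks.append((chunk_start, len(samples)))
    return chunks


# Toy signal: silence, speech, a long pause, then more speech.
samples = [0.0] * 4 + [1.0] * 8 + [0.0] * 8 + [1.0] * 4
chunks = split_on_silence(samples, frame_len=4, threshold=0.5, min_silent_frames=2)
print(chunks)
```

Each `(start, end)` pair can then be written out as its own audio chunk and sent for transcription. Real audio would need a threshold tuned to the recording.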
How to build Python transcriber using Mozilla DeepSpeech
Python code to build a DeepSpeech API and Batch API for a streaming transcriber model. Based on DeepSpeech 0.6, but still very relevant.
Deploying Mozilla DeepSpeech models to AWS Lambda using Serverless
This one is outside my comfort zone but still a good read.
Developer Documentation as a Learning Tool
As a new coder, one of the (many) challenges I encounter is understanding the workflow and process that it takes to develop good code. What decisions does someone developing an R package make along the way? How do they decide what to include and what not to include? How do they utilize the architecture of the project to support their end goal?
A new writer could review the works of other great authors, dissecting vocabulary, sentence structure, plotting, pacing and opening lines. Some do it manually. Data scientists have even statistically analyzed the great authors - DB. You can do the same for code: pick a popular repository and reverse engineer how the author(s) solved the issue they faced. Instead of becoming a better Googler, take the time to internalize what other good code does and how you can apply the bigger patterns to your own problem.
An even better next step as an author would be to analyze the process that a Stephen King or Dostoevsky or whoever writes those billionaire sex novels takes to develop his or her books. These are things that are not apparent from what is on the page: thinking about the plot, outlining, writing down ideas and character studies. Countless books like King’s On Writing talk about the process of developing a novel. But unless an author takes the time to write about their process, someone developing their skills is barred from learning it.
Documentation is the closest thing a coder has to “looking behind the scenes” of how the actual code was developed. I would argue it provides key insight into why certain decisions were made and how the developer went about creating the code architecture. More to come on this topic in the future.
That is why the idea proposed and documented in a four-part series by R developer Simon Couch is so intriguing: there is value in seeing the development process of code, which in this case is the R package stacks. Couch documents how he wanted the package to work (and how to keep track of all of the objects), but he also discusses what he wants the user experience to be as an end result.
Part 1: Introduction
Part 2: Splitting Things Up
Part 3: Naming Things
Part 4: Big Things
More links to help with practicing coding:
- Project Euler. Link - DB
- Programming Not For You? Try Thinking. Link - DB
- Six Essential Language Agnostic Programming Books. Link - DB