ULL Ultra Low Latency Architectures for Electronic Trading – NYU Fall 2018

Increase Trading Profit with Deterministic Latencies

ULL (Ultra Low Latency) Architectures for Electronic Trading
NYU SPS Fall 2018, 8 sessions (3 hours each) – Ted Hruzd
Tuesdays – Sep 18, 25 & Oct 2, 9, 16, 23, 30 & Nov 13 (6-9 pm): in-person, in-class

On-Line Registration – https://www.sps.nyu.edu/professional-pathways/topics/finance/asset-management-and-investment-strategies/FINA1-CE9515-ull-ultra-low-latency-architectures-for-electronic-trading.html#more-details

 

Course Objectives

Develop advanced skills in architecting electronic trading (ET) and market data applications for ultra low latency (ULL), for competitive advantage, and for positive ROI. By the end of the course, students will have developed expertise in the end-to-end architecture of ET applications and infrastructure, including:

  • roles of FPGAs, GPUs, over-clocked servers, and new high-end Intel Skylake servers
  • kernel tuning and NIC kernel bypass tuning
  • options available for architecting ULL networks
  • network performance analysis via WireShark and Corvil
  • latest BlockChain technologies with increasing relevance to ET
  • Application Architecture / Software Designs for ULL ET
  • Machine Learning, AI, and Neural Networks, including Recurrent & Recursive Neural Networks, with RStudio, Python, H2O, and TensorFlow for basic alpha seeking and predictive analytics (no proprietary techniques, just the raw technical skills to enable you to develop your own)

 

Week1: Accelerated Hardware Architectures

 

  • Tick-2-Trade applications with single-digit microsecond latencies, even sub-microsecond
  • How to architect for deterministic latencies even in times of volume spikes
  • Why ‘Meta-Speed’ (information on how to use speed) is more important than pure speed
  • Proper use of multi-layer ULL switches, FPGAs, GPUs, microwave technologies & over-clocked servers
  • Market Data Feed Handlers in FPGA; Order Books in Intel Cores or FPGAs – achieve 20x parallelization for full-depth books?
  • Integration of FPGAs and Intel cores via high speed caches, eventually FPGAs and cores on same die (Intel-Altera current and upcoming enhancements)
  • FIX engines in FPGA-based NICs
  • Multi core, high speed cache Intel based servers + Intel’s new MESH socket interconnects for ULL and deterministic memory I/O
  • Layer 1 and multi layer network switches (Metamako, Exablaze, xCelor, Fixnetix)
  • Fundamentals of FPGA design and programming
  • ROI analysis
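
As an illustration of what "deterministic latency" can mean in practice, the sketch below summarizes a set of tick-to-trade latency samples by median, 99th percentile, and the gap between them. The function name `latency_profile`, the choice of p99 minus p50 as a "jitter" measure, and the sample data are assumptions made for this example; they are not taken from the course material.

```python
import statistics


def latency_profile(samples_us):
    """Summarize latency samples (microseconds) by median and tail."""
    ordered = sorted(samples_us)
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    p50 = statistics.median(ordered)
    p99 = cuts[98]
    return {
        "p50": p50,
        "p99": p99,
        "max": ordered[-1],
        # Assumed jitter measure: distance between the tail and the median.
        "jitter": p99 - p50,
    }


# Hypothetical samples: 99 trades at 5 us and one 50 us spike.
samples = [5.0] * 99 + [50.0]
profile = latency_profile(samples)
print(profile)
```

A system with a low median but a large gap between p99 (or max) and the median is fast on average yet not deterministic, which is the case the volume-spike bullet above is concerned with.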

 

HOMEWORK:

  1. Ted will present 3 Visio ULL architectures of end-to-end trading systems and ask the class to critique all infrastructure components individually and in the aggregate, along with first steps to start an ROI analysis
  2. Ted will specify the proportion/limits of app acceleration (e.g., FPGA/GPU) infrastructure to use for a new CoLo trading app (many details will be provided, such as SLA requirements, current market share, and projected market share). Students will design with Visio and justify their architecture in 1-2 pages (max)

 

Weeks 2-3: Tuning Linux Kernel and Network Kernel Bypass for ULL

 

  • Tick-2-Trade applications with single-digit microsecond latencies, even sub-microsecond
  • How to architect for deterministic latencies even in times of volume spikes
  • Red Hat Enterprise Linux (RHEL) 7.3/7.4 kernel and NIC tuning for kernel bypass
  • RHEL 7.5 (latest) – new “cpu-partitioning” profile
    • Ideal for ET apps with more processes and threads than cores?
    • Enabler of containers and micro services to run bare metal with ULL?
  • Kernel bypass technologies including RDMA and LDMA
  • Leading FPGA based NIC(s) – from SolarFlare, Mellanox, ExaBlaze, Enyx
  • SolarFlare Direct TCP
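
The idea of dedicating cores to latency-sensitive work can be illustrated at the process level. The sketch below pins the current process to a single core using Python's `os.sched_setaffinity`, which is available on Linux only. The helper name `pin_to_core` is hypothetical, and this is only a small illustration of core affinity, not a description of what the “cpu-partitioning” profile itself does.

```python
import os


def pin_to_core(core_id):
    """Pin the calling process to one core; return the resulting affinity set.

    Returns None on platforms without sched_setaffinity (non-Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {core_id})  # pid 0 means the calling process
    return os.sched_getaffinity(0)


if hasattr(os, "sched_getaffinity"):
    # Pick a core the process is already allowed to run on.
    first_allowed = sorted(os.sched_getaffinity(0))[0]
    print(pin_to_core(first_allowed))
```

Pinning alone does not keep other tasks off the chosen core; that isolation is a system-level configuration concern and is outside what this sketch shows.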

HOMEWORK:

  • Ted will provide links to documents for a deep understanding of Linux and NIC kernel tuning. This will lead to a 30-minute test at the start of the Week 4 class

 

 

Weeks 4-5: Network, Storage, Messaging Architectures for ULL, including Hands-On Corvil Analytics Training

 

  • 20-30 minute test covering weeks 2-3
  • ULL messaging middleware (29West LBM/UME and 60East Technologies AMPS)
  • PTP architectures for large market data / trading application infrastructures
    • Role of accurate times to single-digit nanoseconds
  • Network appliances – detailed timings/analytics – network, market data, and order routing – Corvil, Instrumentix, SolarCapture
    • By end of 2nd week, attain advanced skills in use of Corvil analytics
    • Expect 2 week remote access to a Dev Corvil appliance
    • Expect in-class hands on exercises to master as much of Corvil as possible in 2 weeks
  • ULL Networks, including options for integrating multi-layer switches, FPGA appliances, new approaches to ULL multi-cast market data distribution
  • ULL storage networks, including NVMe-oF fabrics, Intel Optane, 3D XPoint, EverSpin new MRAM deterministic memory + persistent storage options
  • How ULL deterministic memory can lower end-2-end latencies for subset of application flows, especially those based on ULL analytics
  • Correlation of ULL networks and fill rates
  • Network performance analysis via WireShark (Hands-on in class) and also via Corvil
  • Tools (some free, several with RH Linux) to attain network performance optimization insights
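
For the PTP topic above, the standard delay request-response exchange yields four timestamps from which a slave clock estimates its offset from the master. The sketch below shows that arithmetic. The function name and the example timestamps are made up for illustration, and the calculation assumes the network path delay is the same in both directions.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and one-way path delay from PTP timestamps.

    t1: master sends Sync            (master clock)
    t2: slave receives Sync          (slave clock)
    t3: slave sends Delay_Req        (slave clock)
    t4: master receives Delay_Req    (master clock)
    Assumes a symmetric path delay in both directions.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2  # slave clock minus master clock
    delay = ((t2 - t1) + (t4 - t3)) / 2   # one-way path delay
    return offset, delay


# Hypothetical timestamps in nanoseconds: slave runs 100 ns ahead of the
# master, and the one-way path delay is 500 ns.
offset_ns, delay_ns = ptp_offset_and_delay(t1=1000, t2=1600, t3=2000, t4=2400)
print(offset_ns, delay_ns)
```

Any asymmetry between the two directions shows up directly as offset error, which is one reason timestamp accuracy at the nanosecond level depends on the network design as well as the clocks.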

 

HOMEWORK:

  1. Design a new ULL infrastructure architecture per new info covered in weeks 4-5 (due in 2 weeks). Full details at the end of class 4, with a progress follow-up at the start of class 5
  2. Hands-on Corvil exercises to prepare for a 30-minute test at the start of Week 6

 

Week 6: Machine Learning & AI: increasing relevance in ULL ET

 

  • 30 minute test on network architectures including Corvil analytics
  • Return Architecture assignment along with Quick review of the most pertinent
  • Math behind multiple ML models, with deep dive into Neural Networks (NN)
  • Advanced NN (Recurrent)
  • ML / NN for seeking alpha via basic R and Python programming (more R than Python)
  • Supervised vs unsupervised ML
  • Synergies with Data Mining
  • Optimal Architectures for ML: Infrastructure, Software
  • Role of SME in ML & AI
  • Determining what model to choose
    • Intro to Lenovo’s new server optimized for ML / AI Time-2-Market, its CPU cores + GPU infrastructure, + numerous pre-installed ML solutions, including option to run few ML models in parallel to determine most accurate model
  • How to interpret results
  • How to verify models
  • TensorFlow for parallelization of ML models
    • explore abilities of R & Python HPAT for parallelization + to vectorize code
  • How to tune, tweak models for greater accuracy and predictive value
  • ML and Event Stream processing, real time analytics for seeking alpha (trade opportunities)
  • Definition of Deep Learning (DL)
  • DL Models and use cases
  • Define AI; provide use cases
  • ML and DL as inputs to AI
  • Time-2-Market & ROI projections for ML / AI initiatives end-2-end
  • Best Practices in AI in our industry
  • In class (hands-on, all using RStudio)
    • RStudio & H2O
    • Portfolio analysis via Classification Model using R/H2O
    • Predictive analysis of new trading strategies via Decision Trees
    • Pattern Recognition of Trading Patterns to provide an Alpha service for Buy-Side
    • Recurrent Neural Net (RNN) to predict ET Fill rates and trading revenues

HOMEWORK:

  • Reading material and ML models to learn/analyze, in prep for 30 minute test start of Week
  • Reading material in prep for the Final Arch assignment, which will be presented at the end of Class 7 and is due for Class 8

 

 

Week 7: BlockChain Architectures: How to scale, approach Low Latency processing, attain ROI + New Software paradigms available for ULL Application Architectures (with Python, C++, Java Code examples)

  • ULL software design (deep vectors ex Intel’s AVX-512 & multi threading – OpenMP, TBB, Intel C++ Studio)
  • Python development of basic Algo strategies & software design/analysis for back testing Algo’s
  • Intel’s new “HPAT” Python compiler with directives to parallelize Python code
  • Robin Systems – application aware fabric (all software – role for ULL analytics)?
  • Hot right now – Chronicle: a Java based microservices framework touting superior mem mgt + horizontal scalability for FIX Engines and more
    • Integration of Red Hat Linux 7 “cpu-partitioning” profile for Chronicle micro services
    • Other microservices such as Docker Containers
    • Use of OpenShift and Kubernetes for ULL orchestration
      • Explore latest Kubernetes plug-in to SolarFlare OpenOnload to accomplish the above
    • BlockChain – can common interests lead to technology that benefits all, cuts costs, and speeds up settlements (LL settlements), and enhances TCA?
      • What electronic trading applications can integrate with BlockChain?
      • How to architect such applications
      • Learn BlockChain “Smart Contract” protocol
        • Assess how the above can scale BlockChain and render it “Low Latency” for Clearing, Compliance, even ET, starting with Wealth Mgt

HOMEWORK:

  • Specific Details, along with early phases of Blockchain Design options (Ted will forward) for completing Final Arch assignment
    • Goal will be an architecture to optimize a BlockChain architecture for Clearing, Settlement, Risk Analytics, & Trading

 

Week 8 – Software Defined Infrastructures (SDI) + Futures

  • Review of Final Architecture Assignment
    • Expect students to persuade instructor how assignment meets ROI and Time-2-Market criteria
  • Review of most significant aspects of class with class participation (be prepared to answer questions, offer solutions, identify gaps, etc)
  • Innovative, deterministic latency Market data routing – from start-up LightFleet
  • Updated status of Cadence ASICs goal to supplant FPGA’s as go-to hardware acceleration for most demanding ULL ET
  • SDN (Software Defined Networks) – when applicable for ULL trading applications

 

PreReq – (for most, expecting basic to intermediate expertise, unless noted)

  • Most important: at least 2 years working with electronic trading applications/infrastructures as Developer, SA, network admin/engineer, Architect, QA analyst, tech project mgr, operations engineer, manager, CTO, CIO, CEO, vendor or consultant providing technology to Wall Street IT,
  • TCP/IP, UDP, multicast (basic knowledge),
  • Linux OS and shell or scripting (e.g., bash, perl); at minimum, basic familiarity with the output and usefulness of core Linux commands such as sysctl -a, ethtool, ifconfig, top, ls, grep, awk, sed, and others listed later in this syllabus
  • Intel servers, cores, sockets, GHz clock speed, NUMA
  • Network routers, switches
  • 1 or more network protocols from BGP, OSPF, EIGRP, MPLS, IB
  • FIX protocol
  • Market Data, at minimum contents of equities consolidated feeds
  • Visio (will use for homework assignments; HOWEVER – to save time I will accept ‘pictures’ of white board architectures / designs)
  • R programming (nice to have. Will use basics that one can learn in 1-2 hours), then extend upon that in classes for class hands-on Machine Learning
  • Python (very basic will be fine – a 2 hour reading assignment will be arranged for beginners). We will use a text written for traders with zero programming experience that quickly trains them how to use small set of Python for creating trading algo’s

 

Course Logistics

  • 8 sessions @3 hours each 6-9 pm, NYU MidTown – 11 W 42nd Street, or @ NYU SPS 7 East 12th Street; Room TBD;
  • Tech book(s) (OPTIONAL) to download to kindle
    • NEW: Machine Learning with R – 2nd Edition by Brett Lantz, PACKT Publishers
    • OPTIONAL – Ultimate Algorithmic Trading Systems ToolBox, George Pruitt, Wiley, 2016
  • Multiple web site links to technical white papers and tech analyses (e.g., nextplatform.com, http://intelligenttradingtechnology.com/, http://datamanagementreview.com, www.tabbforum.com, www.tradersmagazine.com)
  • Visio OR simply draw on whiteboards, send pictures (some homework assignments)
  • Extensive use of white board by instructor and students. Sessions will present students with few infrastructures to architect per specific business success criteria
  • Grading:
    • 1/3 class participation (in-class architecture designs, whiteboard sessions)
      • Every class will include several questions for students to answer and address, so the Instructor can assess the class’s level of understanding. Students will also be encouraged to be interactive in class and to propose options for addressing new architecture designs
    • 1/3 quizzes / tests
      • These will be used to assess students’ mastery of technical details. 30-minute in-class tests will include multiple choice and a few questions on how students would address real-world architecture technical gaps or new strategic initiatives
      • For the second consecutive course, expect remote access to a Development Corvil appliance for 2 weeks, with 14 days of remote logins to learn Corvil fundamentals in week 1 and more advanced topics in week 2. Corvil (corvil.com) is a market leader in providing both network architecture and analytical software to diagnose and resolve electronic trading transaction problems. This includes the ability to propose infrastructure and application performance optimization recommendations to speed up trading, with the goal of increasing trading revenues. This material will be included on one test
    • 1/3 – Homework – visio, wireshark analysis, basic python algo programming
      • Architecture assignments must include 1-2 pagers succinctly detailing key design choices and aspects, along with persuasive arguments for the design. This will include ROI and time-2-market expectations.  Successful Architects in this field present designs & architectures to senior and C-Level executives.  Grading will be judged on students’ effectiveness not only in technical expertise but in writing skills.
      • Final assignment will include the above criteria and also 5 minute presentations to Instructor who will play role of key C-Level decision maker
        • Regarding expected class diversity – ex: prior 2 classes included very hands on tech SA’s, Engineers, Architects; however, these classes also included not as technical but business-savvy Project Managers (PMs).  In the end both sets mastered the new technical material quite equally.  However, Instructor was flexible in grading given this diversity.  PM homework assignments were judged more on providing technical plans for time-2-market, whereas the more technical students were judged more on their tech arguments for their proposed designs