spikeforest2 ironclust update procedure

[Compile ironclust2]
1. Run `irc_mcc` in `ironclust/matlab`
2. Run `copyfile run_irc ~/src/spikeforest2/spikeforest2/sorters/ironclust/container`

[Build the docker in spikeforest2]
3. Run `cd-spikeforest2`
4. Run `cd spikeforest2/sorters/ironclust`
5. Edit ironclust version numbers in `_ironclust.py`
6. Run `cd container`
7. Edit ironclust version numbers in `build_docker.sh` and `push_docker.sh`
8. Run `./build_docker.sh && ./push_docker.sh`

[Test and push to git]
9. Run `cd ~/src/spikeforest2/examples`
10. Run `python example_ironclust.py` (set `HITHER_USE_SINGULARITY=TRUE` in `.bashrc`)
11. Run `git add -u . && git commit -m "ironclust updated" && git push`

linux exit error codes

/usr/include/asm/errno.h

#define EPERM            1      /* Operation not permitted */
#define ENOENT           2      /* No such file or directory */
#define ESRCH            3      /* No such process */
#define EINTR            4      /* Interrupted system call */
#define EIO              5      /* I/O error */
#define ENXIO            6      /* No such device or address */
#define E2BIG            7      /* Arg list too long */
#define ENOEXEC          8      /* Exec format error */
#define EBADF            9      /* Bad file number */
#define ECHILD          10      /* No child processes */
#define EAGAIN          11      /* Try again */
#define ENOMEM          12      /* Out of memory */
#define EACCES          13      /* Permission denied */
#define EFAULT          14      /* Bad address */
#define ENOTBLK         15      /* Block device required */
#define EBUSY           16      /* Device or resource busy */
#define EEXIST          17      /* File exists */
#define EXDEV           18      /* Cross-device link */
#define ENODEV          19      /* No such device */
#define ENOTDIR         20      /* Not a directory */
#define EISDIR          21      /* Is a directory */
#define EINVAL          22      /* Invalid argument */
#define ENFILE          23      /* File table overflow */
#define EMFILE          24      /* Too many open files */
#define ENOTTY          25      /* Not a typewriter */
#define ETXTBSY         26      /* Text file busy */
#define EFBIG           27      /* File too large */
#define ENOSPC          28      /* No space left on device */
#define ESPIPE          29      /* Illegal seek */
#define EROFS           30      /* Read-only file system */
#define EMLINK          31      /* Too many links */
#define EPIPE           32      /* Broken pipe */
#define EDOM            33      /* Math argument out of domain of func */
#define ERANGE          34      /* Math result not representable */
#define EDEADLK         35      /* Resource deadlock would occur */
#define ENAMETOOLONG    36      /* File name too long */
#define ENOLCK          37      /* No record locks available */
#define ENOSYS          38      /* Function not implemented */
#define ENOTEMPTY       39      /* Directory not empty */
#define ELOOP           40      /* Too many symbolic links encountered */
#define EWOULDBLOCK     EAGAIN  /* Operation would block */
#define ENOMSG          42      /* No message of desired type */
#define EIDRM           43      /* Identifier removed */
#define ECHRNG          44      /* Channel number out of range */
#define EL2NSYNC        45      /* Level 2 not synchronized */
#define EL3HLT          46      /* Level 3 halted */
#define EL3RST          47      /* Level 3 reset */
#define ELNRNG          48      /* Link number out of range */
#define EUNATCH         49      /* Protocol driver not attached */
#define ENOCSI          50      /* No CSI structure available */
#define EL2HLT          51      /* Level 2 halted */
#define EBADE           52      /* Invalid exchange */
#define EBADR           53      /* Invalid request descriptor */
#define EXFULL          54      /* Exchange full */
#define ENOANO          55      /* No anode */
#define EBADRQC         56      /* Invalid request code */
#define EBADSLT         57      /* Invalid slot */

#define EDEADLOCK       EDEADLK

#define EBFONT          59      /* Bad font file format */
#define ENOSTR          60      /* Device not a stream */
#define ENODATA         61      /* No data available */
#define ETIME           62      /* Timer expired */
#define ENOSR           63      /* Out of streams resources */
#define ENONET          64      /* Machine is not on the network */
#define ENOPKG          65      /* Package not installed */
#define EREMOTE         66      /* Object is remote */
#define ENOLINK         67      /* Link has been severed */
#define EADV            68      /* Advertise error */
#define ESRMNT          69      /* Srmount error */
#define ECOMM           70      /* Communication error on send */
#define EPROTO          71      /* Protocol error */
#define EMULTIHOP       72      /* Multihop attempted */
#define EDOTDOT         73      /* RFS specific error */
#define EBADMSG         74      /* Not a data message */
#define EOVERFLOW       75      /* Value too large for defined data type */
#define ENOTUNIQ        76      /* Name not unique on network */
#define EBADFD          77      /* File descriptor in bad state */
#define EREMCHG         78      /* Remote address changed */
#define ELIBACC         79      /* Can not access a needed shared library */
#define ELIBBAD         80      /* Accessing a corrupted shared library */
#define ELIBSCN         81      /* .lib section in a.out corrupted */
#define ELIBMAX         82      /* Attempting to link in too many shared libraries */
#define ELIBEXEC        83      /* Cannot exec a shared library directly */
#define EILSEQ          84      /* Illegal byte sequence */
#define ERESTART        85      /* Interrupted system call should be restarted */
#define ESTRPIPE        86      /* Streams pipe error */
#define EUSERS          87      /* Too many users */
#define ENOTSOCK        88      /* Socket operation on non-socket */
#define EDESTADDRREQ    89      /* Destination address required */
#define EMSGSIZE        90      /* Message too long */
#define EPROTOTYPE      91      /* Protocol wrong type for socket */
#define ENOPROTOOPT     92      /* Protocol not available */
#define EPROTONOSUPPORT 93      /* Protocol not supported */
#define ESOCKTNOSUPPORT 94      /* Socket type not supported */
#define EOPNOTSUPP      95      /* Operation not supported on transport endpoint */
#define EPFNOSUPPORT    96      /* Protocol family not supported */
#define EAFNOSUPPORT    97      /* Address family not supported by protocol */
#define EADDRINUSE      98      /* Address already in use */
#define EADDRNOTAVAIL   99      /* Cannot assign requested address */
#define ENETDOWN        100     /* Network is down */
#define ENETUNREACH     101     /* Network is unreachable */
#define ENETRESET       102     /* Network dropped connection because of reset */
#define ECONNABORTED    103     /* Software caused connection abort */
#define ECONNRESET      104     /* Connection reset by peer */
#define ENOBUFS         105     /* No buffer space available */
#define EISCONN         106     /* Transport endpoint is already connected */
#define ENOTCONN        107     /* Transport endpoint is not connected */
#define ESHUTDOWN       108     /* Cannot send after transport endpoint shutdown */
#define ETOOMANYREFS    109     /* Too many references: cannot splice */
#define ETIMEDOUT       110     /* Connection timed out */
#define ECONNREFUSED    111     /* Connection refused */
#define EHOSTDOWN       112     /* Host is down */
#define EHOSTUNREACH    113     /* No route to host */
#define EALREADY        114     /* Operation already in progress */
#define EINPROGRESS     115     /* Operation now in progress */
#define ESTALE          116     /* Stale NFS file handle */
#define EUCLEAN         117     /* Structure needs cleaning */
#define ENOTNAM         118     /* Not a XENIX named type file */
#define ENAVAIL         119     /* No XENIX semaphores available */
#define EISNAM          120     /* Is a named type file */
#define EREMOTEIO       121     /* Remote I/O error */
#define EDQUOT          122     /* Quota exceeded */

#define ENOMEDIUM       123     /* No medium found */
#define EMEDIUMTYPE     124     /* Wrong medium type */
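As a quick sanity check, the numeric values above match what the standard library exposes; a small sketch using Python's `errno` and `os` modules:

```python
import errno
import os

# Resolve a few of the codes listed above through the standard library.
for name in ["EPERM", "ENOENT", "EACCES", "ECONNREFUSED"]:
    code = getattr(errno, name)           # numeric value, e.g. ENOENT -> 2
    print(name, code, os.strerror(code))  # e.g. "ENOENT 2 No such file or directory"
```

This is handy when a sorter container exits with a bare numeric status: `echo $?` after the run, then look the number up here.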

irc2 speed test (fGpu={0,1} x fParfor={0,1})

# Environment
– Xeon, 20 cores, 3.4 GHz
– 256 GB RAM
– SSD

# GPU ON, parfor ON
Recording format
Recording file: C:\tmp\groundtruth\hybrid_synth\drift_siprobe\rec_64c_1200s_11\raw.mda
Probe file: C:\tmp\groundtruth\hybrid_synth\drift_siprobe\rec_64c_1200s_11\geom.csv
Recording Duration: 1200.0s
Data Type: int16
#Channels in file: 64
#Sites: 64
#Shanks: 1
Pre-processing
Filter type: bandpass
Filter range (Hz): 300.0-6000.0
Common ref: mean
FFT threshold: 8
Events
#Spikes: 497473
Feature extracted: gpca
#Sites/event: 14
maxDist_site_um: 50
maxDist_site_spk_um: 75
#Features/event: 20
Cluster
#Clusters: 96
#Unique events: 493550
min. spk/clu: 30
Cluster method: drift-knn
knn: 30
nTime_clu: 4
nTime_drift: 60
fSpatialMask_clu: 0
Auto-merge
delta_cut: 1.000
maxWavCor: 0.990
Runtime (s)
Detect + feature (s): 49.9s
Cluster runtime (s): 9.3s
merge runtime (s): 12.4s
Total runtime (s): 71.6s
Runtime speed: x16.8 realtime
memory usage (GiB): 4.716
detect(GiB): 4.716
sort(GiB): 0.742
Execution
fGpu (GPU use): 1
fParfor (parfor use): 1
Parameter file: C:\tmp\irc2\hybrid_synth\drift_siprobe\rec_64c_1200s_11\raw_geom.prm

# GPU ON, parfor OFF
Recording format
Recording file: C:\tmp\groundtruth\hybrid_synth\drift_siprobe\rec_64c_1200s_11\raw.mda
Probe file: C:\tmp\groundtruth\hybrid_synth\drift_siprobe\rec_64c_1200s_11\geom.csv
Recording Duration: 1200.0s
Data Type: int16
#Channels in file: 64
#Sites: 64
#Shanks: 1
Pre-processing
Filter type: bandpass
Filter range (Hz): 300.0-6000.0
Common ref: mean
FFT threshold: 8
Events
#Spikes: 497473
Feature extracted: gpca
#Sites/event: 14
maxDist_site_um: 50
maxDist_site_spk_um: 75
#Features/event: 20
Cluster
#Clusters: 96
#Unique events: 493550
min. spk/clu: 30
Cluster method: drift-knn
knn: 30
nTime_clu: 4
nTime_drift: 60
fSpatialMask_clu: 0
Auto-merge
delta_cut: 1.000
maxWavCor: 0.990
Runtime (s)
Detect + feature (s): 116.5s
Cluster runtime (s): 19.6s
merge runtime (s): 12.4s
Total runtime (s): 148.5s
Runtime speed: x8.1 realtime
memory usage (GiB): 0.989
detect(GiB): 0.989
sort(GiB): 0.618
Execution
fGpu (GPU use): 1
fParfor (parfor use): 0
Parameter file: C:\tmp\irc2\hybrid_synth\drift_siprobe\rec_64c_1200s_11\raw_geom.prm

# GPU OFF, parfor ON
Recording format
Recording file: C:\tmp\groundtruth\hybrid_synth\drift_siprobe\rec_64c_1200s_11\raw.mda
Probe file: C:\tmp\groundtruth\hybrid_synth\drift_siprobe\rec_64c_1200s_11\geom.csv
Recording Duration: 1200.0s
Data Type: int16
#Channels in file: 64
#Sites: 64
#Shanks: 1
Pre-processing
Filter type: bandpass
Filter range (Hz): 300.0-6000.0
Common ref: mean
FFT threshold: 8
Events
#Spikes: 497473
Feature extracted: gpca
#Sites/event: 14
maxDist_site_um: 50
maxDist_site_spk_um: 75
#Features/event: 20
Cluster
#Clusters: 96
#Unique events: 493638
min. spk/clu: 30
Cluster method: drift-knn
knn: 30
nTime_clu: 4
nTime_drift: 60
fSpatialMask_clu: 0
Auto-merge
delta_cut: 1.000
maxWavCor: 0.990
Runtime (s)
Detect + feature (s): 71.4s
Cluster runtime (s): 11.0s
merge runtime (s): 12.1s
Total runtime (s): 94.6s
Runtime speed: x12.7 realtime
memory usage (GiB): 4.720
detect(GiB): 4.720
sort(GiB): 0.700
Execution
fGpu (GPU use): 0
fParfor (parfor use): 1
Parameter file: C:\tmp\irc2\hybrid_synth\drift_siprobe\rec_64c_1200s_11\raw_geom.prm

# GPU OFF, parfor OFF
Recording format
Recording file: C:\tmp\groundtruth\hybrid_synth\drift_siprobe\rec_64c_1200s_11\raw.mda
Probe file: C:\tmp\groundtruth\hybrid_synth\drift_siprobe\rec_64c_1200s_11\geom.csv
Recording Duration: 1200.0s
Data Type: int16
#Channels in file: 64
#Sites: 64
#Shanks: 1
Pre-processing
Filter type: bandpass
Filter range (Hz): 300.0-6000.0
Common ref: mean
FFT threshold: 8
Events
#Spikes: 497473
Feature extracted: gpca
#Sites/event: 14
maxDist_site_um: 50
maxDist_site_spk_um: 75
#Features/event: 20
Cluster
#Clusters: 96
#Unique events: 493638
min. spk/clu: 30
Cluster method: drift-knn
knn: 30
nTime_clu: 4
nTime_drift: 60
fSpatialMask_clu: 0
Auto-merge
delta_cut: 1.000
maxWavCor: 0.990
Runtime (s)
Detect + feature (s): 236.1s
Cluster runtime (s): 48.3s
merge runtime (s): 12.0s
Total runtime (s): 296.4s
Runtime speed: x4.0 realtime
memory usage (GiB): 1.013
detect(GiB): 1.013
sort(GiB): 0.614
Execution
fGpu (GPU use): 0
fParfor (parfor use): 0
Parameter file: C:\tmp\irc2\hybrid_synth\drift_siprobe\rec_64c_1200s_11\raw_geom.prm
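As a cross-check, the realtime factors above follow directly from recording duration divided by total runtime (a quick sketch over the four runs):

```python
# Cross-check the reported realtime factors for the 1200 s recording.
duration_s = 1200.0
runs = {
    "GPU ON, parfor ON": 71.6,
    "GPU ON, parfor OFF": 148.5,
    "GPU OFF, parfor ON": 94.6,
    "GPU OFF, parfor OFF": 296.4,
}
for name, total_s in runs.items():
    print(f"{name}: x{duration_s / total_s:.1f} realtime")
```

The computed factors reproduce the reported x16.8, x8.1, x12.7, and x4.0; on this recording parfor gives the larger gain, with the GPU adding a further speedup on top.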

Flatiron weekly progress: Sep30-Oct4

# ironclust v2
– […] run memory benchmark
– […] update spikeforest website benchmark
– [x] plot quality comparison
– [x] update the spikeforest wrapper

# Dan English
– create SNR distribution plot
– compare with other datasets

# paper writing
– jeremy flow chart
– contribute to spikeforest

# misc
– [x] ottawa travel reimbursement

# Computer maintenance
## Ubuntu
– [x] VNC viewer installation
– [x] yakuake terminal (`sudo apt-get install yakuake`)
– [x] vscode (got stuck, can’t install code on terminal)

## Windows
– [x] Windows 10 install on moneyboxwin
– [x] office 365, TreeSizeFree, KarenReplicator
– [x] Copy 5GB backup drive

## Disk drive
– […] Initialize RAID 48GB
– [ ] Copy recordings to 48GB
– [ ] Copy personal files to 48GB
– [ ] Build 60 GB Linux partition, put in recordings
– [x] Build 48 GB backup drive (RAID5), put in all recordings

irc2 development log

auto-merge: using feature RMS instead of waveform correlation

dataset: hybrid_janelia_static
(64ch, 1200s, 72 units, 30KS/s)
fParfor=0, fGpu=0:
Detect + feature (s): 132.0s
Cluster (s): 94.0s
Automerge (s): 17.4s
Total runtime (s): 243.4s
Runtime speed: x4.9 realtime
detect (GiB): 0.900
sort (GiB): 0.380

fParfor=0, fGpu=1:
Detect + feature (s): 57.5s
Cluster (s): 30.6s
Automerge (s): 19.6s
Total runtime (s): 107.6s
Runtime speed: x11.2 realtime
detect (GiB): 1.090
sort (GiB): 0.482

fParfor=1 (4 local workers), fGpu=0:
Detect + feature (s): 86.9s
Cluster (s): 48.7s
Automerge (s): 14.1s
Total runtime (s): 149.7s
Runtime speed: x8.0 realtime
detect (GiB): 4.192
sort (GiB): 0.577

fParfor=1 (4 local workers), fGpu=1: CRASHED

fParfor=1 (20 local workers), fGpu=0:
Detect + feature (s): 76.1s
Cluster (s): 22.1s
Automerge (s): 10.7s
Total runtime (s): 108.9s
Runtime speed: x11.0 realtime
detect (GiB): 4.169
sort (GiB): 0.560

fParfor=1 (20 local workers), fGpu=1: CRASHED

fParfor=1 (20 remote workers), fGpu=0:
Detect + feature (s): 58.4s
Cluster (s): 19.2s
Automerge (s): 9.3s
Total runtime (s): 86.9s
Runtime speed: x13.8 realtime
detect (GiB): 4.221
sort (GiB): 0.743

fParfor=1 (20 remote workers), fGpu=1 (-p gpu="gpures:2"):
Detect + feature (s): 38.2s
Cluster (s): 12.4s
Automerge (s): 9.9s
Total runtime (s): 60.5s
Runtime speed: x19.8 realtime
detect (GiB): 4.174
sort (GiB): 0.334

irc2 post merging using position and amplitude of clusters

Use Gaussian kernel smoothing (make sure the kernel falls off by half at half the minimum distance). Normalize by projecting a uniform field and ensuring a uniform field comes back.

The advantage of this approach is robustness to where the peak site is located when determining the peak location.

Gaussian kernel convolution, with maximum slope at the minimum separation distance (sigma = d_min).
Inferring spike position using PC1 is more precise than using other components.
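A minimal sketch of the normalization idea (the helper name `smooth_field` and the per-cluster `weights` array are hypothetical, not from the notes above): smoothing the weighted field and the weights separately, then dividing, guarantees that a uniform input field comes back exactly uniform.

```python
import numpy as np

def smooth_field(pos, values, weights, sigma):
    """Gaussian-kernel smoothing with uniform-field normalization.

    Dividing the smoothed weighted field by the smoothed weights means
    a constant (uniform) input field is returned exactly unchanged.
    """
    # pairwise squared distances between cluster positions
    d2 = ((pos[:, None, :] - pos[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-d2 / (2.0 * sigma**2))  # sigma = d_min per the note above
    num = K @ (weights * values)        # smoothed weighted field
    den = K @ weights                   # the "uniform projection" term
    return num / den
```

Any constant field `values = c` gives `num = c * den`, so the output is exactly `c` regardless of cluster geometry or weights.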
Great study music helping me to focus

irc2 development log

# fixed automerging issue
– Spike indexing was incorrect when extracting trPc 3D array.
– Waveform shifting produced a comparable result.

# todo
– [x] compute rho and delta using parallel resources
– [x] compare performance between irc and irc2
– [ ] add drift correction and compare drift performance

# Runtime comparison
– dataset: static_siprobe\rec_64c_1200s_11
– irc.m: 123s, mean accuracy: 89.8, 62 above 0.8 accuracy, 1.8GB
– irc2.m (fGpu=1, fParfor=0): 64s, mean accuracy: 89.3, 62 above 0.8 accuracy, 0.776GB

# irc2.m speed test (drift correction not implemented yet)
– fGpu=1, fParfor=0: 64s
– fGpu=0, fParfor=0: 389s
– fGpu=0, fParfor=1: 158.6s (20 nodes, local)

# irc2.m test on linux workstation
– fGpu=0, fParfor=0: 292.5s
– fGpu=0, fParfor=1: 133s (20 nodes, remote)
– fGpu=1, fParfor=0: 45s
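The implied speedups over each machine's fGpu=0, fParfor=0 baseline can be computed from the timings above (a quick sketch; "first machine" refers to the earlier speed test, assumed to be the Windows box):

```python
# Speedups relative to each machine's fGpu=0, fParfor=0 baseline.
first_machine = {"baseline": 389.0, "gpu": 64.0, "parfor": 158.6}
linux_ws      = {"baseline": 292.5, "gpu": 45.0, "parfor": 133.0}

for label, t in (("first machine", first_machine), ("linux workstation", linux_ws)):
    print(f"{label}: GPU x{t['baseline'] / t['gpu']:.1f}, "
          f"parfor x{t['baseline'] / t['parfor']:.1f}")
```

On both machines the GPU gives roughly a 6x speedup, while 20-worker parfor gives only about 2.5x.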

Losing weight by walking to work and back

2019 Sep 22: 93 KG

action: walked for an hour to get to work. Will do the same on the way back. That's two hours of walking per day. I will also save $20 a day by not taking the ferry. I will lose 1 KG, save $20, and spend an extra hour a day commuting to work and back. In a month I will be 30 KG lighter and $600 richer.

sep 19 2019 @ flatiron

Start: 10:30 AM
Goals: memory loop plot, data backup, v4.9.5 debug with Bapun

# Memory test status
Still going. The param_set2.prm includes cached results, so I need to account for file read time, which should be about 1 GB/s.

# Bapun v4.9.5 vs v4.9.11 comparison
No obvious difference. A formatting issue is suspected. Run his dataset tomorrow using his own parameters; also run using the makeprm command.

# Dan English Dataset library
Not downloaded yet. I should do this after making the parforeval command.

# Disk backup
The new 4-bay disk enclosure is set up. Each can hold 40TB (RAID0). My personal data will be on RAID5 (30TB), and the recordings will be on RAID0 and stored in CEPH. Linux gets 20TB of scratch (RAID0) to be managed by LVM. Eventually I will invest in 14TBx4, which gives 12TB extra with RAID5; this comes at a $1600 price tag. I can only afford an enclosure (30TB) at this point. I will set up a 10TB scratch drive in Linux and 10TB in Windows. I will keep a copy of the data at home to be used with my Lenovo.
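The capacity figures follow from standard RAID arithmetic; a sketch assuming four 10 TB drives per enclosure (inferred from the 40 TB RAID0 figure above):

```python
def raid0_tb(n_drives, size_tb):
    # RAID0 stripes across all drives: full capacity, no redundancy.
    return n_drives * size_tb

def raid5_tb(n_drives, size_tb):
    # RAID5 gives up one drive's worth of capacity to parity.
    return (n_drives - 1) * size_tb

print(raid0_tb(4, 10))                       # 40, matches the RAID0 enclosure figure
print(raid5_tb(4, 10))                       # 30, matches the personal-data RAID5 figure
print(raid5_tb(4, 14) - raid5_tb(4, 10))     # 12, the "12TB extra" from upgrading to 14TBx4
```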

## Final goal
– 40TB Hitachi RAID5 keep at home (enclosure ordered, fill with personal data)
– 40TB WD RAID0 keep @ work (filled with recordings, hardware RAID)
– 40TB WD RAID0 windows enclosure (temp data keeping purpose)
– 20TB Hitachi, keep in Moneybox (10TB for windows, 10TB for ubuntu)

## Data migration plan
– day0: empty 80TB WD RAID5 tower to 40TB RAID0 Hitachi (recordings) and 20TB Hitachi drives (personal)
– day1: copy RAID0 Hitachi (recordings) to CEPH via Globus (over the weekend, ~30TB)
– day2: bring 4-bay enclosure from home (Monday), add 40TB WD, copy from 40TB Hitachi RAID0 overnight
– day2: Change 40TB Hitachi to RAID5 and copy personal data (20TB Hitachi) overnight
– day3: Take 40TB Hitachi RAID5 home, borrow 40TB WD to set up linux RAID