Coastal Hazards @ Virginia Tech
  • Home
  • People
    • Jennifer L. Irish
    • Robert Weiss
    • Tina Dura
    • Current Students
    • Alumni
  • Research
  • Teaching
  • Center for Coastal Studies
  • Publications
  • Announcements
  • Open Source Software
  • Blog
  • Outreach

VT Coastal Blog: Exploring the quantitative world

Freaking out about your child? Stay calm and use your skills!

1/4/2017


As we were settling into the Christmas break, I realized something: there would be no daycare until January! This on its own is not much of a problem, but we had both been looking forward to some quiet time. However, it got worse. But before it got worse, it got better, because for the first couple of days that our son Linus was home with us, he slept in. So we got to sleep in as well, which was needed. Then I started to wake up around 7:30 or so and get up, expecting Linus to get up at some point too. I actually enjoyed the quiet time in the mornings. A few more days went by like this, and I got used to the quiet mornings. But of course this quiet time was ruined.

The most common thought at this stage of the narrative would, of course, be that Linus started to wake up at 5:30 or so. On the contrary, he continued to sleep until we woke him up at 10. Furthermore, he also slept a lot during his nap. One day he even asked if I could wake him up for his bedtime... perhaps he too was realizing that he was sleeping a lot. So his parental units got really worried and did what one does: we googled around for 20 minutes and became instant sleep experts. We looked through the usual chat rooms, finding comments ranging between "growth spurt" and "emergency room", and found a plot with "normal", "recommended", and "not recommended" regions. One gets used to this kind of information quickly, but it is not very helpful.

Later that day, as the information I had read earlier sank in, I thought: "How worried do I have to be?" I had two options: (a) eat a whole bunch of cookies and drink enough wine to forget this topic until after the holidays, or (b) look for some data and try to see how abnormal this really is. I decided on the latter. So I started my search for sleep-related data... it turns out this data is hard to find. After an hour or so of scanning different manuscripts on Google Scholar, I found Williams, J.A., Zimmerman, F.J., Bell, J.F. (2013): Norms and trends of sleep time among US children and adolescents, JAMA Pediatrics, 167(1), 55-60, which contained a figure with the 10$^{th}$, 25$^{th}$, 50$^{th}$, 75$^{th}$, and 90$^{th}$ percentiles of night sleep on weekdays and weekends for ages between 0.5 and 18 years, and of daytime sleep on weekdays and weekends for ages between 0.5 and 5 years. I digitized the data and started playing with it.

Let's start: ...

We have to start with the usual loading of modules in jupyter:

In [1]:
%pylab inline
from scipy.interpolate import interp1d
from scipy.special import erf
import csv
import random
import warnings
matplotlib.rcParams.update({'font.size': 18,'legend.fontsize':15,'font.family': 'serif'})
warnings.filterwarnings('ignore')
Populating the interactive namespace from numpy and matplotlib

Data¶

The data that I digitized are in different files: {A..F}.csv. Each file contains six columns, where the first column is age and the subsequent columns are the different percentiles plotted in Williams et al. (2013). The figure below depicts the data from A.csv, representing weekday night sleep.
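To make the file layout concrete, here is how a single row is parsed; the numbers below are made up, not actual values from the digitized files:

```python
# Each row of one of the digitized files ({A..F}.csv) holds six
# comma-separated numbers: age, then the 10th, 25th, 50th, 75th, and
# 90th percentiles of sleep hours. This example row is made up.
row = "4.5,9.5,10.2,10.9,11.6,12.3"
age, p10, p25, p50, p75, p90 = [float(v) for v in row.split(",")]
print(p50)  # 10.9
```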

In [2]:
file='A.csv'
aage=[]
a10=[]
a25=[]
a50=[]
a75=[]
a90=[]
f = open(file)
csv_f = csv.reader(f)
for row in csv_f:
    aage.append(float(row[0]))
    a10.append(float(row[1]))
    a25.append(float(row[2]))
    a50.append(float(row[3]))
    a75.append(float(row[4]))
    a90.append(float(row[5]))
fig0,ax0 = subplots(1,figsize=(10,8.1))
ax0.plot(aage,a10,label=r'$10^{th}$')
ax0.plot(aage,a25,label=r'$25^{th}$')
ax0.plot(aage,a50,label=r'$50^{th}$')
ax0.plot(aage,a75,label=r'$75^{th}$')
ax0.plot(aage,a90,label=r'$90^{th}$')
ax0.set_xlim([0.5,18])
ax0.legend(loc='upper left', bbox_to_anchor=(0.05,1.),title='Percentile' ,frameon=False,fancybox=True, shadow=False, ncol=5, numpoints=1)
ax0.set_xlabel('Age [yrs]')
ax0.set_ylabel('Night Sleep [hrs]')
Out[2]:
<matplotlib.text.Text at 0x7f4ac906c7d0>

Helper functions¶

The next few cells contain some scripts to help with the model.

In [3]:
def random_floats(low, high, size):
    # returns an array of 'size' random numbers between 'low' and 'high'.
    return [random.uniform(low, high) for _ in xrange(size)]

def error_th(arr,brr,thres):
    nnx=len(arr)
    new_a=[]
    for i_n in range(nnx):
        if arr[i_n]<=thres:
            new_a=concatenate((new_a,brr[i_n,:]))
    new_a=array(new_a)           
    new_a=reshape(new_a, (len(new_a)/len(brr[0,:]), len(brr[0,:])))                  
    return new_a

The second function (lines 5 to 13) in the cell above is sort of interesting, but essentially just helps to keep those runs whose L$_2$ norm is smaller than the threshold 'thres'. I use the L$_2$ norm to find suitable realizations that describe the data.
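The same filtering can be written more compactly with NumPy boolean indexing. This standalone sketch (with made-up numbers) keeps exactly the rows whose error is at or below the threshold:

```python
import numpy as np

def keep_below_threshold(errors, realizations, thres):
    # Same idea as error_th above: keep only the realizations (rows)
    # whose L2 error is at or below the threshold.
    errors = np.asarray(errors)
    realizations = np.asarray(realizations)
    return realizations[errors <= thres]

errors = [0.02, 0.10, 0.04]
realizations = [[1., 2.], [3., 4.], [5., 6.]]
kept = keep_below_threshold(errors, realizations, 0.05)
print(kept.shape)  # (2, 2): the first and third rows survive
```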

The cell below contains more helper functions.

In [4]:
def find_sleep_distribution(gage,nnage,nnc10,nnc25,nnc50,nnc75,nnc90):
    len_age=len(nnage)
    for i in range(len_age):
        if gage>=nnage[i]:
            age_index=i
            break
    distri_y=[10,25,50,75,90]
    distri_x=[nnc10[age_index],nnc25[age_index],nnc50[age_index],nnc75[age_index],nnc90[age_index]]
    return distri_x, distri_y

def find_percent(linus_sleep,nnage,xx):
    len_age=len(nnage)
#    print len_age
    for i in range(len_age):
        if nnage[i]>=linus_sleep:
            h_index=i
            break
#    print h_index
    len_range=len(xx[:,0])
    percent=[]
#    print len_range,shape(xx)
    for j in range(len_range):
        percent.append(float(xx[j,h_index]))
    percent=array(percent)
    return percent

def inter(x,y,nx):
    f = interp1d(x, y,'cubic')
    ny=f(nx)
    return ny

def fitFunc(x,a,b):
    return 0.5*(1.+erf((x-a)/(b*sqrt(2.))))

The last function (lines 32 and 33) in the cell is interesting because it reveals that we use the error function to approximate the cumulative distribution given by the data in Williams et al. (2013).
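To see why, note that $0.5\,(1+\mathrm{erf}((x-a)/(b\sqrt{2})))$ is exactly the cumulative distribution function of a normal distribution with mean $a$ and standard deviation $b$. A quick sanity check at the mean and one standard deviation above it (the numbers are just illustrative):

```python
import math

def fit_func(x, a, b):
    # CDF of a normal with mean a and standard deviation b,
    # expressed through the error function.
    return 0.5 * (1. + math.erf((x - a) / (b * math.sqrt(2.))))

print(fit_func(11.0, 11.0, 2.0))            # 0.5 at the mean
print(round(fit_func(13.0, 11.0, 2.0), 2))  # ~0.84 one sigma above
```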

The cell below contains some basic information, like Linus' age and the amount he sleeps during the day and night on weekdays and on weekends. If you try this with your child, this is where you need to change things.

In [5]:
linus_age=4.5
naage=linspace(0.5,18.5,73)
nbage=linspace(0.5,5.0,21)

weekday_night_hours=11.
weekend_night_hours=12.

weekday_day_hours=1.0
weekend_day_hours=1.0

sim_hours_n=linspace(0,20,41)
sim_hours_d=linspace(0,10,11)

Method¶

Weekdays: Night¶

In [6]:
file='A.csv'
aage=[]
a10=[]
a25=[]
a50=[]
a75=[]
a90=[]
f = open(file)
csv_f = csv.reader(f)
for row in csv_f:
    aage.append(float(row[0]))
    a10.append(float(row[1]))
    a25.append(float(row[2]))
    a50.append(float(row[3]))
    a75.append(float(row[4]))
    a90.append(float(row[5]))
aage=array(aage)
a10=array(a10)
a25=array(a25)
a50=array(a50)
a75=array(a75)
a90=array(a90)
na10=inter(aage,a10,naage)
na25=inter(aage,a25,naage)
na50=inter(aage,a50,naage)
na75=inter(aage,a75,naage)
na90=inter(aage,a90,naage)
a_x,a_y = find_sleep_distribution(linus_age,naage,na10,na25,na50,na75,na90)
a_x=array(a_x)
a_y=array(a_y)/100.
sim_hours=linspace(0,20,41)
asigma=random_floats(1.5, 3.5, 10000)
amu=random_floats(a_x[2]-0.5, a_x[2]+0.5, 10000)
dummycases=zeros([len(asigma),len(sim_hours_n)])
nerror=zeros([len(asigma)])
for i in range(len(asigma)):
    dummycases[i,:]=fitFunc(sim_hours_n,amu[i],asigma[i])
    dummy=inter(sim_hours_n,dummycases[i,:],a_x)
    nerror[i] = (dot(a_y-dummy,a_y-dummy))**.5
acases=error_th(nerror,dummycases,0.05)
weekday_night=find_percent(weekday_night_hours,sim_hours_n,acases)

The first 30 lines in the cell above deal with the data. Line 32 creates the $\sigma$ array for the error function, and line 33 the respective $\mu$. You see that the vectors for both have 10000 elements, resulting in 10000 realizations. The loop from line 36 to 39 computes the model realizations. Line 40 is interesting as well because it uses the L$_2$ norm to limit the realizations employed to calculate the percentile for the weekday nights (last line). The third argument (0.05) in line 40 is the threshold L$_2$ norm: realizations with a larger L$_2$ norm are discarded. The L$_2$ norm is computed between the data and the model, of course. The figure below shows the results.
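Stripped of the file handling, the whole recipe fits in a few lines. The sketch below uses made-up percentile data (drawn from a normal with mean 11 and standard deviation 2, so that some realizations actually pass the threshold) rather than the digitized Williams et al. values:

```python
import math
import random
import numpy as np

random.seed(0)

def cdf(x, mu, sigma):
    # Normal CDF via the error function, as in fitFunc.
    return 0.5 * (1. + math.erf((x - mu) / (sigma * math.sqrt(2.))))

# Made-up data: sleep hours at the 10th..90th percentiles.
data_x = np.array([8.44, 9.65, 11.0, 12.35, 13.56])
data_y = np.array([0.10, 0.25, 0.50, 0.75, 0.90])

kept = []
for _ in range(10000):
    mu = random.uniform(data_x[2] - 0.5, data_x[2] + 0.5)
    sigma = random.uniform(1.5, 3.5)
    model = np.array([cdf(x, mu, sigma) for x in data_x])
    err = math.sqrt(np.dot(data_y - model, data_y - model))  # L2 norm
    if err <= 0.05:
        kept.append((mu, sigma))

print(len(kept) > 0)  # True: some realizations survive the threshold
```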

In [7]:
fig1,ax1 = subplots(1,figsize=(10,8.1))
for i in range(len(acases)):
    ax1.plot(sim_hours,acases[i,:],'grey');
ax1.plot(a_x,a_y,'k-',lw=2);
ax1.set_xlim([5,20]);
ax1.set_xlabel("Sleep [hrs]");
ax1.set_ylabel("Percentile of kids");

We see from the figure above that the model realizations envelop the data. That is what I wanted.

Now repeat this with all the other elements...

Total: Night¶

In [8]:
file='C.csv'
cage=[]
c10=[]
c25=[]
c50=[]
c75=[]
c90=[]
f = open(file)
csv_f = csv.reader(f)
for row in csv_f:
    cage.append(float(row[0]))
    c10.append(float(row[1]))
    c25.append(float(row[2]))
    c50.append(float(row[3]))
    c75.append(float(row[4]))
    c90.append(float(row[5]))
cage=array(cage)
c10=array(c10)
c25=array(c25)
c50=array(c50)
c75=array(c75)
c90=array(c90)
nc10=inter(cage,c10,naage)
nc25=inter(cage,c25,naage)
nc50=inter(cage,c50,naage)
nc75=inter(cage,c75,naage)
nc90=inter(cage,c90,naage)

c_x,c_y = find_sleep_distribution(linus_age,naage,nc10,nc25,nc50,nc75,nc90)
c_x=array(c_x)
c_y=array(c_y)/100.
csigma=random_floats(0.1, 10.5, 10000)
cmu=random_floats(c_x[2]-0.5, c_x[2]+0.5, 10000)
dummycases=zeros([len(csigma),len(sim_hours_n)])
nerror=zeros([len(csigma)])
for i in range(len(csigma)):
    dummycases[i,:]=fitFunc(sim_hours_n,cmu[i],csigma[i])
    dummy=inter(sim_hours_n,dummycases[i,:],c_x)
    nerror[i] = (dot(c_y-dummy,c_y-dummy))**.5
ccases=error_th(nerror,dummycases,0.05)  
totaltime=1./7.*(5.*weekday_night_hours + 2.*weekend_night_hours)
total_night=find_percent(totaltime,sim_hours_n,ccases)

The only difference in this cell compared to cell 6 is line 41, which computes the average nightly sleep time from the weekday and weekend night sleep.
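That averaging is just a weighted mean over the seven days of the week; with the hours set in cell 5:

```python
weekday_night_hours = 11.
weekend_night_hours = 12.
# Five weekday nights and two weekend nights per week.
totaltime = (5. * weekday_night_hours + 2. * weekend_night_hours) / 7.
print(round(totaltime, 2))  # 11.29
```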

Weekend: Night¶

In [9]:
file='B.csv'
bage=[]
b10=[]
b25=[]
b50=[]
b75=[]
b90=[]
f = open(file)
csv_f = csv.reader(f)
for row in csv_f:
    bage.append(float(row[0]))
    b10.append(float(row[1]))
    b25.append(float(row[2]))
    b50.append(float(row[3]))
    b75.append(float(row[4]))
    b90.append(float(row[5]))
bage=array(bage)
b10=array(b10)
b25=array(b25)
b50=array(b50)
b75=array(b75)
b90=array(b90)
nb10=inter(bage,b10,naage)
nb25=inter(bage,b25,naage)
nb50=inter(bage,b50,naage)
nb75=inter(bage,b75,naage)
nb90=inter(bage,b90,naage)

b_x,b_y = find_sleep_distribution(linus_age,naage,nb10,nb25,nb50,nb75,nb90)
b_x=array(b_x)
b_y=array(b_y)/100.
bsigma=random_floats(1, 3.6, 10000)
bmu=random_floats(b_x[2]-0.5, b_x[2]+0.5, 10000)
dummycases=zeros([len(bsigma),len(sim_hours_n)])
nerror=zeros([len(bsigma)])
for i in range(len(bsigma)):
    dummycases[i,:]=fitFunc(sim_hours_n,bmu[i],bsigma[i])
    dummy=inter(sim_hours_n,dummycases[i,:],b_x)
    nerror[i] = (dot(b_y-dummy,b_y-dummy))**.5
bcases=error_th(nerror,dummycases,0.04)    
weekend_night=find_percent(weekend_night_hours,sim_hours_n,bcases)

Weekday: Day¶

In [10]:
file='D.csv'
dage=[]
d10=[]
d25=[]
d50=[]
d75=[]
d90=[]
f = open(file)
csv_f = csv.reader(f)
for row in csv_f:
    dage.append(float(row[0]))
    d10.append(float(row[1]))
    d25.append(float(row[2]))
    d50.append(float(row[3]))
    d75.append(float(row[4]))
    d90.append(float(row[5]))
dage=array(dage)
d10=array(d10)
d25=array(d25)
d50=array(d50)
d75=array(d75)
d90=array(d90)
nd10=inter(dage,d10,nbage)
nd25=inter(dage,d25,nbage)
nd50=inter(dage,d50,nbage)
nd75=inter(dage,d75,nbage)
nd90=inter(dage,d90,nbage)
d_x,d_y = find_sleep_distribution(linus_age,nbage,nd10,nd25,nd50,nd75,nd90)
d_x=array(d_x)
d_y=array(d_y)/100.
dsigma=random_floats(1., 2.9, 10000)
dmu=random_floats(d_x[2]-0.5, d_x[2]+0.5, 10000)
dummycases=zeros([len(dsigma),len(sim_hours_d)])
nerror=zeros([len(dsigma)])
for i in range(len(dsigma)):
    dummycases[i,:]=fitFunc(sim_hours_d,dmu[i],dsigma[i])
    dummy=inter(sim_hours_d,dummycases[i,:],d_x)
    nerror[i] = (dot(d_y-dummy,d_y-dummy))**.5
dcases=error_th(nerror,dummycases,0.15)    
weekday_day=find_percent(weekday_day_hours,sim_hours_d,dcases)
In [11]:
file='E.csv'
eage=[]
e10=[]
e25=[]
e50=[]
e75=[]
e90=[]
f = open(file)
csv_f = csv.reader(f)
for row in csv_f:
    eage.append(float(row[0]))
    e10.append(float(row[1]))
    e25.append(float(row[2]))
    e50.append(float(row[3]))
    e75.append(float(row[4]))
    e90.append(float(row[5]))
eage=array(eage)
e10=array(e10)
e25=array(e25)
e50=array(e50)
e75=array(e75)
e90=array(e90)
ne10=inter(eage,e10,nbage)
ne25=inter(eage,e25,nbage)
ne50=inter(eage,e50,nbage)
ne75=inter(eage,e75,nbage)
ne90=inter(eage,e90,nbage)

e_x,e_y = find_sleep_distribution(linus_age,nbage,ne10,ne25,ne50,ne75,ne90)
e_x=array(e_x)
e_y=array(e_y)/100.
esigma=random_floats(1., 2.9 ,10000)
emu=random_floats(e_x[2]-0.5, e_x[2]+0.5, 10000)
dummycases=zeros([len(esigma),len(sim_hours_d)])
nerror=zeros([len(esigma)])
for i in range(len(esigma)):
    dummycases[i,:]=fitFunc(sim_hours_d,emu[i],esigma[i])
    dummy=inter(sim_hours_d,dummycases[i,:],e_x)
    nerror[i] = (dot(e_y-dummy,e_y-dummy))**.5
ecases=error_th(nerror,dummycases,0.15)    
weekend_day=find_percent(weekend_day_hours,sim_hours_d,ecases)
In [12]:
file='F.csv'
fage=[]
f10=[]
f25=[]
f50=[]
f75=[]
f90=[]
f = open(file)
csv_f = csv.reader(f)
for row in csv_f:
    fage.append(float(row[0]))
    f10.append(float(row[1]))
    f25.append(float(row[2]))
    f50.append(float(row[3]))
    f75.append(float(row[4]))
    f90.append(float(row[5]))
fage=array(fage)
f10=array(f10)
f25=array(f25)
f50=array(f50)
f75=array(f75)
f90=array(f90)
nf10=inter(fage,f10,nbage)
nf25=inter(fage,f25,nbage)
nf50=inter(fage,f50,nbage)
nf75=inter(fage,f75,nbage)
nf90=inter(fage,f90,nbage)

f_x,f_y = find_sleep_distribution(linus_age,nbage,nf10,nf25,nf50,nf75,nf90)
f_x=array(f_x)
f_y=array(f_y)/100.
fsigma=random_floats(1., 2.9, 10000)
fmu=random_floats(f_x[2]-0.5, f_x[2]+0.5, 10000)
dummycases=zeros([len(fsigma),len(sim_hours_d)])
nerror=zeros([len(fsigma)])
for i in range(len(fsigma)):
    dummycases[i,:]=fitFunc(sim_hours_d,fmu[i],fsigma[i])
    dummy=inter(sim_hours_d,dummycases[i,:],f_x)
    nerror[i] = (dot(f_y-dummy,f_y-dummy))**.5
fcases=error_th(nerror,dummycases,0.15)
totaltime=1./7.*(5.*weekday_day_hours + 2.*weekend_day_hours)
total_day=find_percent(totaltime,sim_hours_d,fcases)

Results¶

In [13]:
print
print " \t\t  mean \t\t  min \t \t  max  \t\t std"
print "\t\t\t\t\t Night"
print "Weekday:", '\t{:6.2f}'.format(mean(weekday_night)),'\t\t{:6.2f}'.format(min(weekday_night)),'\t\t{:6.2f}'.format(max(weekday_night)),'\t\t{:6.4f}'.format(std(weekday_night))
print "Weekend:", '\t{:6.2f}'.format(mean(weekend_night)),'\t\t{:6.2f}'.format(min(weekend_night)),'\t\t{:6.2f}'.format(max(weekend_night)),'\t\t{:6.4f}'.format(std(weekend_night))
print "Total:  ", '\t{:6.2f}'.format(mean(total_night)), '\t\t{:6.2f}'.format(min(total_night)), '\t\t{:6.2f}'.format(max(total_night)), '\t\t{:6.4f}'.format(std(total_night))
print
print "\t\t\t\t\t  Day"
print "Weekday:", '\t{:6.2f}'.format(mean(weekday_day)), '\t\t{:6.2f}'.format(min(weekday_day)), '\t\t{:6.2f}'.format(max(weekday_day)), '\t\t{:6.4f}'.format(std(weekday_day))
print "Weekend:", '\t{:6.2f}'.format(mean(weekend_day)), '\t\t{:6.2f}'.format(min(weekend_day)),'\t\t{:6.2f}'.format(max(weekend_day)),'\t\t{:6.4f}'.format(std(weekend_day))
print "Total:  ", '\t{:6.2f}'.format(mean(total_day)), '\t\t{:6.2f}'.format(min(total_day)), '\t\t{:6.2f}'.format(max(total_day)), '\t\t{:6.4f}'.format(std(total_day))
                  mean            min             max            std
                                         Night
Weekday:          0.20            0.17            0.23          0.0167
Weekend:          0.33            0.31            0.36          0.0120
Total:            0.18            0.16            0.21          0.0143

                                          Day
Weekday:          0.08            0.03            0.14          0.0299
Weekend:          0.12            0.05            0.18          0.0342
Total:            0.14            0.06            0.21          0.0390

After seeing the results, I calmed down, ate some cookies and drank some wine... after all. What I learned from this is simple: we all have skills. Why not use them? To all our nerd friends who have kids, will soon have them, or want them: use your quantitative skills, and don't be afraid to show your model and the results to your doctor... at least I am not, and he or she might get a laugh out of it. Ending on a more serious note, maybe I am kidding myself. Crunching numbers and running models might be nothing more than feeding my addiction or distracting myself, but I argue that it is a lot better than freaking out based on a bunch of chat entries.


Using TSUFLIND

11/30/2016


TSUnami FLow INversion from Deposits

Hui Tang and I think that this is a clever name. In this blog post, I would like to show what is needed to run the code. You can download the code from github.com. Note that the code is not pretty at the moment; Hui and I will be working on making it easier to understand. We plan to make it possible to interact with the code via an IPython/Jupyter notebook instead of an input file, and also to write a real manual. To run TSUFLIND you need to have Python with some extra libraries or modules installed. To keep this simple, you can download the full stack from Enthought or Anaconda; a quick Google search will get you started. Here, I am going to assume that you have Python available. TSUFLIND runs, at least theoretically, on any machine. We have tested it on OSX and Linux (my preference). I have also tested it with the Pythonista app on the iPad... it runs like a charm. For more information about TSUFLIND from a scientific point of view, please read Tang and Weiss (2015, link).

1. Installation
To install TSUFLIND, I will assume that you will use the command line to retrieve it from GitHub. There are desktop applications for those less familiar with the command line that you could use instead. If you want to brave the command line, here is how you do it:
  1. Navigate to the directory into which you want to install TSUFLIND
  2. Type git clone https://github.com/CoastalHazardsVT/TSUFLIND_pd.git
  3. Enjoy.
TSUFLIND comes with a preloaded example. So from the command line you can just type python main.py. If some terminal output appears and there is no error message, you know that it is running and all the necessary packages are installed. Now you are ready to set up your own case.

2. Parameters
All the parameters necessary to run TSUFLIND are in the file parameter_P14abc.txt. The filename does not make much sense, but please don't change it. If you open the file in a text editor, you will see that there are 59 lines in that file, and every line not containing # is important. However, most of the lines do not need to be changed, and most of the parameters are self-explanatory. The first important line is 37: water_run_up, which is exactly what its name suggests. Line 39 is also important; it contains the slope, given as 1/. Lines 37 and 39 are both used in the Soulsby model. If you do not know these values, please estimate them. While it is good to know these parameters in great detail, I will provide more information on that topic in a later blog post. Line 43 denotes the number of samples you have, and line 47 gives the file name (sample_P14abc.csv) containing the respective grain-size distributions of the samples. The last important line is 53. The named file (test.csv) contains the locations along the slope and the total thicknesses of the deposits.
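To give a rough idea of what this means in practice, an annotated excerpt of such a parameter file might look like the sketch below. The line numbers match the description above, but the parameter names other than water_run_up, the values, and the formatting are all hypothetical; check the actual file in the repository for the real syntax:

```
water_run_up: 4.0               # line 37: runup of the tsunami
slope: 0.001                    # line 39: slope, given as 1/
number_of_samples: 3            # line 43: how many samples you have
sample_file: sample_P14abc.csv  # line 47: grain-size distributions
deposit_file: test.csv          # line 53: locations and thicknesses
```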

Running TSUFLIND and Final Remarks
When you are done with your modifications of parameter_P14abc.txt and test.csv, you need to run main.py. From the command line, type python main.py. If you see something like the following... you are good:
bash prompt$ python main.py
# Outputting to File[data_process.csv]:
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
# Outputting to File[sample00.csv]:
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]
# Outputting to File[sample01.csv]:
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>]

and then:
====================================running====================================

...everything is fine. Just give it some time to do the magic.
Let me make some final remarks: TSUFLIND is a research code at its present stage, and this is not a full manual. However, it gives you software to "play" around with and learn about sensitivities. Playing around with models is extremely important in order to learn what they can and cannot do. If you did not change any of the files ending in .py and the model does not run, you did not screw up the model... just the input files. I hope to write another blog post about TSUFLIND soon that will serve more as a tutorial. If you have any questions or trouble, or want to use TSUFLIND for a specific example that you cannot get to run, please do not hesitate to contact me at weiszr@vt.edu.

