Modeling Earthquake Dynamics
In 2012, with Marilou Durand, a student at UQAM, we worked on the seismic gap hypothesis (see e.g. McCann et al. (1978) or Kagan & Jackson (1991)), or, to be more specific, on the dynamics...
On Some Alternatives to Regression Models
When you start discussing with people in machine learning, you quickly hear something like “forget your econometric models, your GLMs, I can easily find a machine learning ‘model’ that can beat yours”....
Classification with Categorical Variables (the fuzzy side)
The Gaussian and the (log) Poisson regressions share a very interesting property, i.e. the average predicted value is the empirical mean of our sample. > mean(predict(lm(dist~speed,data=cars))) [1]...
An Attempt to Understand Boosting Algorithm(s)
Last Tuesday, at the annual meeting of the French Economic Association, I was having lunch with Alfred, and while we were chatting about modeling issues (econometric models against machine learning...
Econometrics vs. Machine Learning with Temporal Patterns
A few months ago, I published a (long) post entitled ‘some thoughts on economics, mathematics, econometrics, machine learning, etc’. In that post, I discussed possible differences between...
Computational Actuarial Science, with R, in Barcelona
This Wednesday, I will give a graduate crash course on computational actuarial science, with R, which will be the second part of the lecture of Tuesday. Slides are now available,
How long could it take to run a regression
This afternoon, while talking with Montserrat (aka @mguillen_estany), we wondered how long it might take to run a regression model. More specifically, how long it might take if we use a...
Data Science for Actuaries, Regression Models with R
After an introduction to Advanced R, the last part of our crash course will cover visualization and graphs (from the previous set of slides), and I have just uploaded additional slides on...
What the ROC Curve (and the AUC) Does Not Tell
While preparing a talk for next Tuesday, I was going through the results returned for an exercise, and I got a rather strange result with a classification model. I had given the same dataset...
What is a Linear Trend, by the way?
I had a very strange discussion on Twitter (yes, another one) about regression curves. I think it started with a tweet based on some xkcd picture (just for fun, because it was New Year’s Day) “don’t...
The myth of interpretability of econometric models
There are important discussions nowadays about data modeling, and about choosing between the “two cultures” (as mentioned in Breiman (2001)), i.e. either econometric models or machine/statistical learning...
Visualizing effects of a categorical explanatory variable in a regression
Recently, I’ve been working on two problems that might be related to semiotic issues in predictive modeling (i.e. instead of a standard regression table, how can we plot coefficient values in a...
On the interpretation of a regression model
Yesterday, NaytaData (aka @NaytaData) posted a nice graph on reddit, with bicycle traffic and mean air temperature, in Helsinki, Finland, per day. I found that graph interesting, so I asked for the...
Classification from scratch, overview 0/8
Before my course on « big data and economics » at the University of Barcelona in July, I wanted to upload a series of posts on classification techniques, to get some insight into machine learning tools....
Quantile Regression (home made)
After my series of posts on classification algorithms, it’s time to get back to R code, this time for quantile regression. Yes, I still want to get a better understanding of optimization routines, in...
Linear Regression, with Map-Reduce
Sometimes, with big data, matrices are too big to handle, and it is possible to use tricks to still do the math numerically. Map-Reduce is one of those. With several cores, it is possible to split the...
Parallelizing Linear Regression or Using Multiple Sources
My previous post explained how it is mathematically possible to parallelize the computation needed to estimate the parameters of a linear regression. More specifically, we have a matrix \mathbf{X} which is...
Convex Regression Model
This morning, during the lecture on nonlinear regression, I mentioned (very) briefly the case of convex regression. Since I forgot to mention the R code, I will publish it here. Assume that...
Regression on a Categorical Variable and ANOVA
This morning, in the STT5100 course, we discussed regression on a categorical variable. In particular, we had started by looking at what the regression without the intercept would give, and its...
Probabilistic Foundations of Econometrics, part 2
This post is the second one of our series on the history and foundations of econometric and machine learning models. Part 1 is online here. Geometric Properties of this Linear Model Let’s define the...
Random thoughts on econometric models with (pure) random features
For my lectures on applied linear models, I wanted to illustrate the fact that the R^2 is never a good measure of the goodness of the model, since it’s quite easy to improve it. Consider the following...
Aggregated Data and Compositional Variables
With Enora Belz, we have just put online a methodological note, Données Agrégées et Variables Compositionnelles, on HAL. The reform of personal data law in Europe makes...
On leverage
Last week, in our STT5100 (applied linear models) class, I introduced the hat matrix and the notion of leverage. In a classical regression model, \boldsymbol{y}=\boldsymbol{X}\boldsymbol{\beta} (in...
On Abuse of Notation in Regression Models
Somewhat ritually, I always start my regression course by coming back to an important point in statistics: abuses of notation! Because everyone uses the same letters...
On the Practice of Regression
Since the beginning of the term, I have introduced a small innovation: roughly every other week, I give a short exercise (mandatory but ungraded) before class, in order to force students to think...
Quantile Regression (home made, part 2)
A few months ago, I posted a note with some home-made code for quantile regression… there was something odd in the output, but it was because there was a (small) mathematical problem in my equation....
On the Second Kiss Cool Effect (Multiple Regression, Scoring and Evaluation)
When I was a kid (a very long time ago, back when I watched quite a lot of television), there was an ad for Kiss Cool mints. And when I present multiple regression to...
Regression discontinuity model for TV series
In September, we are usually happy to see our favorite TV series back on air… Or not? Because, admit it, if we are happy to see those characters back, most of the time, we are disappointed, too. So why...
On Cascading Regressions
This weekend, I posted a short note on the second Kiss Cool effect, where I recalled that when you run a regression on several correlated explanatory variables, it is not...
From multinomial regression to binary classification on some Siamese data
There are two kinds of people in the world: people who think there are two kinds of people in the world and people who don’t (borrowed from Menand (2018)). Because things are always simpler when we...