# Figure out which commit you want to edit by getting its SHA.
git log
# Start an interactive rebase ($SHA = your commit's SHA and the ^ is important!).
git rebase --interactive $SHA^
# [Change 'pick' to 'edit' for your commit and save the buffer]
# [Add your changes with git add -p, etc.]
# Amend the commit (add --no-edit if you want to keep the existing message).
git commit --amend
# Finalize and apply the rebase.
git rebase --continue
# Or cancel the rebase and restore everything to the state before you started rebasing.
git rebase --abort
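The steps above can be scripted end to end with GIT_SEQUENCE_EDITOR, which replaces the interactive editor with a command that rewrites the rebase todo list. A minimal sketch (the repo, file names, and commit messages are made up for the demo):

```shell
# Scripted run of the workflow above: edit the middle of three commits.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email demo@example.com
git config user.name Demo
echo one   > file.txt && git add file.txt && git commit -qm "first"
echo two   > file.txt && git commit -qam "second"
echo three > file.txt && git commit -qam "third"
SHA=$(git rev-parse HEAD^)   # we want to edit the middle commit ("second")
# Change 'pick' to 'edit' on the first todo line instead of opening an editor.
GIT_SEQUENCE_EDITOR='sed -i "1s/^pick/edit/"' git rebase --interactive "$SHA^"
# The rebase has stopped at "second"; add a change and amend it in.
echo extra > notes.txt && git add notes.txt
git commit --amend --no-edit
git rebase --continue
git log --oneline
```

Note that the amended change touches a new file (notes.txt), so replaying the later commit applies cleanly; amending file.txt itself could produce a conflict you would have to resolve before git rebase --continue succeeds.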
From Nick Janetakis – Change a Git Commit in the Past with Amend and Rebase Interactive (https://nickjanetakis.com/blog/change-a-git-commit-in-the-past-with-amend-and-rebase-interactive)
Run Spark locally and access S3
By changing the code (note the spark.hadoop. prefix: Hadoop options set through SparkConf generally need it to be propagated to the Hadoop configuration):
import org.apache.spark.SparkConf

val sparkConfig = new SparkConf()
  .set("spark.hadoop.fs.s3a.aws.credentials.provider", "com.amazonaws.auth.DefaultAWSCredentialsProviderChain")
  .setMaster("local[*]")
By adding JVM arguments to Java:
-Dspark.master=local[*]
-Dspark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.DefaultAWSCredentialsProviderChain
By setting the JVM properties from code (I have not tested whether this works for the credentials provider, but it should):
System.setProperty("spark.master", "local[*]")
System.setProperty("spark.hadoop.fs.s3a.aws.credentials.provider", "com.amazonaws.auth.DefaultAWSCredentialsProviderChain")
The AWS credentials will be taken from the default profile, or you can select a profile with the environment variable AWS_PROFILE=<your profile>.
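Putting the JVM-argument and profile options together, a launch command might look like the following sketch (the jar name and the "dev" profile name are hypothetical placeholders, not from the original post):

```shell
# Hypothetical: run a locally built Spark app against S3 using the
# "dev" AWS profile; replace my-spark-app.jar with your own artifact.
AWS_PROFILE=dev java \
  -Dspark.master='local[*]' \
  -Dspark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.DefaultAWSCredentialsProviderChain \
  -jar my-spark-app.jar
```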
Working configurations for Spark, Hadoop, and the AWS SDK
Tested working combination 1:
– Spark 2.4.4
– Hadoop 3.1.1
– AWS SDK 1.11.271
Tested working combination 2:
– Spark 2.4.4
– Hadoop 2.8.5
– AWS SDK 1.11.271